2022-08-17T12:28:36.7812862Z Requested labels: linux.8xlarge.nvidia.gpu 2022-08-17T12:28:36.7812969Z Job defined at: pytorch/pytorch/.github/workflows/_linux-test.yml@refs/pull/82657/merge 2022-08-17T12:28:36.7812998Z Waiting for a runner to pick up this job... 2022-08-17T12:30:18.4557052Z Job is about to start running on the runner: i-02fdd1ace63d4e018 (repository) 2022-08-17T12:30:24.1830298Z Current runner version: '2.295.0' 2022-08-17T12:30:24.1838134Z Runner name: 'i-02fdd1ace63d4e018' 2022-08-17T12:30:24.1838806Z Runner group name: 'Default' 2022-08-17T12:30:24.1839592Z Machine name: 'ip-10-0-4-249' 2022-08-17T12:30:24.1842392Z ##[group]GITHUB_TOKEN Permissions 2022-08-17T12:30:24.1843364Z Actions: read 2022-08-17T12:30:24.1843771Z Checks: read 2022-08-17T12:30:24.1844203Z Contents: read 2022-08-17T12:30:24.1844699Z Deployments: read 2022-08-17T12:30:24.1845089Z Discussions: read 2022-08-17T12:30:24.1845514Z Issues: read 2022-08-17T12:30:24.1845941Z Metadata: read 2022-08-17T12:30:24.1846315Z Packages: read 2022-08-17T12:30:24.1846794Z Pages: read 2022-08-17T12:30:24.1847233Z PullRequests: read 2022-08-17T12:30:24.1847673Z RepositoryProjects: read 2022-08-17T12:30:24.1848167Z SecurityEvents: read 2022-08-17T12:30:24.1848662Z Statuses: read 2022-08-17T12:30:24.1849051Z ##[endgroup] 2022-08-17T12:30:24.1853441Z Secret source: None 2022-08-17T12:30:24.1854254Z Prepare workflow directory 2022-08-17T12:30:24.3152627Z Prepare all required actions 2022-08-17T12:30:24.3373270Z Getting action download info 2022-08-17T12:30:24.5550928Z Download action repository 'pytorch/pytorch@master' (SHA:2a096e940d33a33c4eb6df1c2ed4da607bd31a7f) 2022-08-17T12:30:27.8449115Z Download action repository 'nick-fields/retry@71062288b76e2b6214ebde0e673ce0de1755740a' (SHA:71062288b76e2b6214ebde0e673ce0de1755740a) 2022-08-17T12:30:27.9517942Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:3c1d75049465d7dfa70acca6d80b9c5c06ff4886) 2022-08-17T12:30:28.2511687Z Getting 
action download info 2022-08-17T12:30:28.4569753Z Download action repository 'malfet/checkout@silent-checkout' (SHA:f63e9e15406be6060f159846cd2e098f759c5246) 2022-08-17T12:30:28.8655303Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@master 2022-08-17T12:30:28.8655708Z with: 2022-08-17T12:30:28.8655977Z submodules: recursive 2022-08-17T12:30:28.8656217Z fetch-depth: 0 2022-08-17T12:30:28.8656449Z env: 2022-08-17T12:30:28.8656703Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:30:28.8656956Z ##[endgroup] 2022-08-17T12:30:28.8959510Z ##[group]Run retry () { 2022-08-17T12:30:28.8959833Z retry () { 2022-08-17T12:30:28.8960146Z  $* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*) 2022-08-17T12:30:28.8960449Z } 2022-08-17T12:30:28.8960690Z echo "${GITHUB_WORKSPACE}" 2022-08-17T12:30:28.8960986Z if [ -z "${NO_SUDO}" ]; then 2022-08-17T12:30:28.8961297Z  retry sudo rm -rf "${GITHUB_WORKSPACE}" 2022-08-17T12:30:28.8961576Z else 2022-08-17T12:30:28.8961834Z  retry rm -rf "${GITHUB_WORKSPACE}" 2022-08-17T12:30:28.8962099Z fi 2022-08-17T12:30:28.8962398Z mkdir "${GITHUB_WORKSPACE}" 2022-08-17T12:30:28.8979834Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T12:30:28.8980160Z env: 2022-08-17T12:30:28.8980408Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:30:28.8980645Z NO_SUDO: 2022-08-17T12:30:28.8980892Z ##[endgroup] 2022-08-17T12:30:28.9209199Z /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-08-17T12:30:29.0271210Z ##[group]Run malfet/checkout@silent-checkout 2022-08-17T12:30:29.0271590Z with: 2022-08-17T12:30:29.0271900Z ref: ce6a3c605df99d1df57c0dda75c06d748e54ed2a 2022-08-17T12:30:29.0272236Z fetch-depth: 0 2022-08-17T12:30:29.0272535Z submodules: recursive 2022-08-17T12:30:29.0272822Z quiet-checkout: true 2022-08-17T12:30:29.0273142Z repository: pytorch/pytorch 2022-08-17T12:30:29.0273648Z token: *** 2022-08-17T12:30:29.0273948Z ssh-strict: true 2022-08-17T12:30:29.0274241Z persist-credentials: true 
2022-08-17T12:30:29.0274548Z clean: true 2022-08-17T12:30:29.0274829Z lfs: false 2022-08-17T12:30:29.0275107Z set-safe-directory: true 2022-08-17T12:30:29.0275396Z env: 2022-08-17T12:30:29.0275679Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:30:29.0275961Z ##[endgroup] 2022-08-17T12:30:29.1806778Z Syncing repository: pytorch/pytorch 2022-08-17T12:30:29.1808766Z ##[group]Getting Git version info 2022-08-17T12:30:29.1809630Z Working directory is '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2022-08-17T12:30:29.1810297Z [command]/usr/bin/git version 2022-08-17T12:30:29.1810613Z git version 2.37.1 2022-08-17T12:30:29.1817604Z ##[endgroup] 2022-08-17T12:30:29.1839628Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/b9251ef2-7b87-46e5-9cb3-90249225111f' before making global git config changes 2022-08-17T12:30:29.1840252Z Adding repository directory to the temporary git global config as a safe directory 2022-08-17T12:30:29.1848030Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-08-17T12:30:29.1891104Z Deleting the contents of '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2022-08-17T12:30:29.1896655Z ##[group]Initializing the repository 2022-08-17T12:30:29.1902899Z [command]/usr/bin/git init /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-08-17T12:30:29.1935368Z hint: Using 'master' as the name for the initial branch. This default branch name 2022-08-17T12:30:29.1935843Z hint: is subject to change. To configure the initial branch name to use in all 2022-08-17T12:30:29.1936329Z hint: of your new repositories, which will suppress this warning, call: 2022-08-17T12:30:29.1936690Z hint: 2022-08-17T12:30:29.1937089Z hint: git config --global init.defaultBranch <name> 2022-08-17T12:30:29.1937424Z hint: 2022-08-17T12:30:29.1937858Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and 2022-08-17T12:30:29.1938388Z hint: 'development'. 
The just-created branch can be renamed via this command: 2022-08-17T12:30:29.1938753Z hint: 2022-08-17T12:30:29.1939260Z hint: git branch -m <name> 2022-08-17T12:30:29.1939835Z Initialized empty Git repository in /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/ 2022-08-17T12:30:29.1951129Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch 2022-08-17T12:30:29.1986555Z ##[endgroup] 2022-08-17T12:30:29.1987118Z ##[group]Disabling automatic garbage collection 2022-08-17T12:30:29.1992518Z [command]/usr/bin/git config --local gc.auto 0 2022-08-17T12:30:29.2024117Z ##[endgroup] 2022-08-17T12:30:29.2024634Z ##[group]Setting up auth 2022-08-17T12:30:29.2034903Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2022-08-17T12:30:29.2071778Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || : 2022-08-17T12:30:29.2699284Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2022-08-17T12:30:29.2732267Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || : 2022-08-17T12:30:29.3031554Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2022-08-17T12:30:29.3076149Z ##[endgroup] 2022-08-17T12:30:29.3076706Z ##[group]Fetching the repository 2022-08-17T12:30:29.3085049Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --quiet --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2022-08-17T12:31:17.3745123Z [command]/usr/bin/git rev-parse --verify --quiet ce6a3c605df99d1df57c0dda75c06d748e54ed2a^{object} 2022-08-17T12:31:17.3784358Z [command]/usr/bin/git -c protocol.version=2 fetch --no-tags --prune 
--quiet --no-recurse-submodules origin ce6a3c605df99d1df57c0dda75c06d748e54ed2a 2022-08-17T12:31:18.4143080Z ##[endgroup] 2022-08-17T12:31:18.4143904Z ##[group]Determining the checkout info 2022-08-17T12:31:18.4145875Z ##[endgroup] 2022-08-17T12:31:18.4146453Z ##[group]Checking out the ref 2022-08-17T12:31:18.4152299Z [command]/usr/bin/git checkout --quiet --force ce6a3c605df99d1df57c0dda75c06d748e54ed2a 2022-08-17T12:31:20.0419010Z ##[endgroup] 2022-08-17T12:31:20.0419776Z ##[group]Setting up auth for fetching submodules 2022-08-17T12:31:20.0426962Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2022-08-17T12:31:20.0481644Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2022-08-17T12:31:20.0516388Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2022-08-17T12:31:20.0549529Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2022-08-17T12:31:20.0578583Z ##[endgroup] 2022-08-17T12:31:20.0579034Z ##[group]Fetching submodules 2022-08-17T12:31:20.0584271Z [command]/usr/bin/git submodule sync --recursive 2022-08-17T12:31:20.0908015Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2022-08-17T12:31:20.1214396Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni' 2022-08-17T12:31:20.1217358Z Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16' 2022-08-17T12:31:20.1220721Z Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv' 2022-08-17T12:31:20.1224768Z Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK' 2022-08-17T12:31:20.1229740Z Submodule 'third_party/QNNPACK' 
(https://github.com/pytorch/QNNPACK) registered for path 'third_party/QNNPACK' 2022-08-17T12:31:20.1233555Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK' 2022-08-17T12:31:20.1237741Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark' 2022-08-17T12:31:20.1241795Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 'third_party/cpuinfo' 2022-08-17T12:31:20.1246068Z Submodule 'third_party/cub' (https://github.com/NVlabs/cub.git) registered for path 'third_party/cub' 2022-08-17T12:31:20.1250684Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend' 2022-08-17T12:31:20.1254987Z Submodule 'third_party/eigen' (https://gitlab.com/libeigen/eigen.git) registered for path 'third_party/eigen' 2022-08-17T12:31:20.1259682Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm' 2022-08-17T12:31:20.1264908Z Submodule 'third_party/flatbuffers' (https://github.com/google/flatbuffers.git) registered for path 'third_party/flatbuffers' 2022-08-17T12:31:20.1270019Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt' 2022-08-17T12:31:20.1276018Z Submodule 'third_party/foxi' (https://github.com/houseroad/foxi.git) registered for path 'third_party/foxi' 2022-08-17T12:31:20.1281199Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp' 2022-08-17T12:31:20.1286533Z Submodule 'third_party/gloo' (https://github.com/facebookincubator/gloo) registered for path 'third_party/gloo' 2022-08-17T12:31:20.1291996Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest' 2022-08-17T12:31:20.1297513Z 
Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep' 2022-08-17T12:31:20.1303315Z Submodule 'third_party/ios-cmake' (https://github.com/Yangqing/ios-cmake.git) registered for path 'third_party/ios-cmake' 2022-08-17T12:31:20.1309663Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi' 2022-08-17T12:31:20.1315669Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto' 2022-08-17T12:31:20.1321854Z Submodule 'third_party/nccl/nccl' (https://github.com/NVIDIA/nccl) registered for path 'third_party/nccl/nccl' 2022-08-17T12:31:20.1328199Z Submodule 'third_party/neon2sse' (https://github.com/intel/ARM_NEON_2_x86_SSE.git) registered for path 'third_party/neon2sse' 2022-08-17T12:31:20.1334609Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann' 2022-08-17T12:31:20.1341294Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx' 2022-08-17T12:31:20.1348477Z Submodule 'third_party/onnx-tensorrt' (https://github.com/onnx/onnx-tensorrt) registered for path 'third_party/onnx-tensorrt' 2022-08-17T12:31:20.1355257Z Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft' 2022-08-17T12:31:20.1362195Z Submodule 'third_party/protobuf' (https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf' 2022-08-17T12:31:20.1369310Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd' 2022-08-17T12:31:20.1376603Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool' 2022-08-17T12:31:20.1384016Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered 
for path 'third_party/pybind11' 2022-08-17T12:31:20.1392143Z Submodule 'third_party/python-enum' (https://github.com/PeachPy/enum34.git) registered for path 'third_party/python-enum' 2022-08-17T12:31:20.1399809Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy' 2022-08-17T12:31:20.1407633Z Submodule 'third_party/python-six' (https://github.com/benjaminp/six.git) registered for path 'third_party/python-six' 2022-08-17T12:31:20.1415351Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef' 2022-08-17T12:31:20.1423282Z Submodule 'third_party/tbb' (https://github.com/01org/tbb) registered for path 'third_party/tbb' 2022-08-17T12:31:20.1432178Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe' 2022-08-17T12:31:20.1440395Z Submodule 'third_party/zstd' (https://github.com/facebook/zstd.git) registered for path 'third_party/zstd' 2022-08-17T12:31:20.1469364Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/android/libs/fbjni'... 2022-08-17T12:31:20.4167976Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FP16'... 2022-08-17T12:31:20.6220776Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FXdiv'... 2022-08-17T12:31:20.8173454Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NNPACK'... 2022-08-17T12:31:21.1668139Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/QNNPACK'... 2022-08-17T12:31:21.4569212Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/XNNPACK'... 2022-08-17T12:31:26.2249829Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/benchmark'... 2022-08-17T12:31:26.6364629Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpuinfo'... 
2022-08-17T12:31:27.1985568Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cub'... 2022-08-17T12:31:28.6416170Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cudnn_frontend'... 2022-08-17T12:31:29.9137113Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/eigen'... 2022-08-17T12:31:36.7277344Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm'... 2022-08-17T12:31:37.4016046Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flatbuffers'... 2022-08-17T12:31:39.0019617Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fmt'... 2022-08-17T12:31:40.1590693Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/foxi'... 2022-08-17T12:31:40.3730885Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'... 2022-08-17T12:31:40.8847156Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gloo'... 2022-08-17T12:31:41.2097196Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/googletest'... 2022-08-17T12:31:42.1799270Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep'... 2022-08-17T12:31:42.5867419Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ios-cmake'... 2022-08-17T12:31:42.7980835Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ittapi'... 2022-08-17T12:31:43.1476146Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto'... 2022-08-17T12:31:44.9556539Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nccl/nccl'... 2022-08-17T12:31:45.4139869Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/neon2sse'... 
2022-08-17T12:31:45.8045360Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nlohmann'... 2022-08-17T12:31:52.0278387Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx'... 2022-08-17T12:31:54.0004513Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt'... 2022-08-17T12:31:54.4494340Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pocketfft'... 2022-08-17T12:31:54.7001853Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf'... 2022-08-17T12:31:59.9118805Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/psimd'... 2022-08-17T12:32:00.1129408Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pthreadpool'... 2022-08-17T12:32:00.3434889Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pybind11'... 2022-08-17T12:32:01.2006142Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-enum'... 2022-08-17T12:32:01.4531194Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-peachpy'... 2022-08-17T12:32:01.9919053Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-six'... 2022-08-17T12:32:02.3771517Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/sleef'... 2022-08-17T12:32:02.9781979Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tbb'... 2022-08-17T12:32:05.3714268Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe'... 2022-08-17T12:32:05.9320434Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/zstd'... 
2022-08-17T12:32:08.0283432Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2022-08-17T12:32:08.0409097Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2022-08-17T12:32:08.0504461Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2022-08-17T12:32:08.0776851Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2022-08-17T12:32:08.1049936Z Submodule path 'third_party/QNNPACK': checked out '7d2a4e9931a82adc3814275b6219a03e24e36b4c' 2022-08-17T12:32:08.8453932Z Submodule path 'third_party/XNNPACK': checked out 'ae108ef49aa5623b896fc93d4298c49d1750d9ba' 2022-08-17T12:32:08.8700948Z Submodule path 'third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415' 2022-08-17T12:32:08.9921843Z Submodule path 'third_party/cpuinfo': checked out '5916273f79a21551890fd3d56fc5375a78d1598d' 2022-08-17T12:32:09.0318677Z Submodule path 'third_party/cub': checked out 'd106ddb991a56c3df1b6d51b2409e36ba8181ce4' 2022-08-17T12:32:09.3865721Z Submodule path 'third_party/cudnn_frontend': checked out '43709ab96c47e26eebcdac72f93f946d44ceffa8' 2022-08-17T12:32:09.6742870Z Submodule path 'third_party/eigen': checked out '3147391d946bb4b6c68edd901f2add6ac1f31f8c' 2022-08-17T12:32:09.7277278Z Submodule path 'third_party/fbgemm': checked out '499cd22f5c2e26041e4f190f628b48478a89a030' 2022-08-17T12:32:09.7295997Z Submodule 'third_party/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/third_party/asmjit' 2022-08-17T12:32:09.7300424Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/third_party/cpuinfo' 2022-08-17T12:32:09.7305224Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/fbgemm/third_party/googletest' 2022-08-17T12:32:09.7310068Z Submodule 
'third_party/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 'third_party/fbgemm/third_party/hipify_torch' 2022-08-17T12:32:09.7336174Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/asmjit'... 2022-08-17T12:32:10.9755771Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/cpuinfo'... 2022-08-17T12:32:11.5597661Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/googletest'... 2022-08-17T12:32:12.5205789Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/hipify_torch'... 2022-08-17T12:32:12.8437767Z Submodule path 'third_party/fbgemm/third_party/asmjit': checked out 'd3fbf7c9bc7c1d1365a94a45614b91c5a3706b81' 2022-08-17T12:32:12.9661389Z Submodule path 'third_party/fbgemm/third_party/cpuinfo': checked out 'ed8b86a253800bafdb7b25c5c399f91bff9cb1f3' 2022-08-17T12:32:13.0340706Z Submodule path 'third_party/fbgemm/third_party/googletest': checked out 'cbf019de22c8dd37b2108da35b2748fd702d1796' 2022-08-17T12:32:13.0453551Z Submodule path 'third_party/fbgemm/third_party/hipify_torch': checked out '1840658c184f3eeba787dae0f06c45756c1daaf5' 2022-08-17T12:32:13.1527032Z Submodule path 'third_party/flatbuffers': checked out 'd0cede9c90c5257537c293517a21376408b549fa' 2022-08-17T12:32:13.1916931Z Submodule path 'third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2022-08-17T12:32:13.2015592Z Submodule path 'third_party/foxi': checked out 'c278588e34e535f0bb8f00df3880d26928038cad' 2022-08-17T12:32:13.2477377Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2022-08-17T12:32:13.2759717Z Submodule path 'third_party/gloo': checked out '5b143513263133af2b95547e97c07cebeb72bf72' 2022-08-17T12:32:13.3304077Z Submodule path 'third_party/googletest': checked out 
'e2239ee6043f73722e7aa812a459f54a28552929' 2022-08-17T12:32:13.3427390Z Submodule path 'third_party/ideep': checked out '8a114a51c116b55c4ceb689b98746786bd00c29b' 2022-08-17T12:32:13.3443878Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn' 2022-08-17T12:32:13.3471644Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'... 2022-08-17T12:32:20.8189657Z Submodule path 'third_party/ideep/mkl-dnn': checked out '888a87a954e4fddb4d81fd10858eb834f2441b46' 2022-08-17T12:32:20.8209836Z Submodule 'third_party/oneDNN' (https://github.com/oneapi-src/oneDNN.git) registered for path 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-08-17T12:32:20.8237263Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn/third_party/oneDNN'... 2022-08-17T12:32:28.1542514Z Submodule path 'third_party/ideep/mkl-dnn/third_party/oneDNN': checked out '52b5f107dd9cf10910aaa19cb47f3abf9b349815' 2022-08-17T12:32:28.1657471Z Submodule path 'third_party/ios-cmake': checked out '8abaed637d56f1337d6e1d2c4026e25c1eade724' 2022-08-17T12:32:28.1822870Z Submodule path 'third_party/ittapi': checked out '5b8a7d7422611c3a0d799fb5fc5dd4abfae35b42' 2022-08-17T12:32:28.2946171Z Submodule path 'third_party/kineto': checked out '0703c78999061b8329dfab7ec5046fc5764a5573' 2022-08-17T12:32:28.2963884Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt' 2022-08-17T12:32:28.2967092Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest' 2022-08-17T12:32:28.2993558Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'... 
2022-08-17T12:32:29.4403008Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'... 2022-08-17T12:32:30.4688188Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '2591ab91c3898c9f6544fff04660276537d32ffd' 2022-08-17T12:32:30.5328237Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347' 2022-08-17T12:32:30.5559993Z Submodule path 'third_party/nccl/nccl': checked out '19ab67d1727d337d10d0a48cbaf5cd119b8d88f1' 2022-08-17T12:32:30.5712549Z Submodule path 'third_party/neon2sse': checked out '97a126f08ce318023be604d03f88bf0820a9464a' 2022-08-17T12:32:30.7021552Z Submodule path 'third_party/nlohmann': checked out '87cda1d6646592ac5866dc703c8e1839046a6806' 2022-08-17T12:32:31.0170490Z Submodule path 'third_party/onnx': checked out 'f7ee1ac60d06abe8e26c9b6bbe1e3db5286b614b' 2022-08-17T12:32:31.0201379Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx/third_party/benchmark' 2022-08-17T12:32:31.0204261Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11' 2022-08-17T12:32:31.0233426Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/benchmark'... 2022-08-17T12:32:31.4343845Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'... 
2022-08-17T12:32:32.2895311Z Submodule path 'third_party/onnx/third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415' 2022-08-17T12:32:32.3265025Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'ffa346860b306c9bbfb341aed9c14c067751feb8' 2022-08-17T12:32:32.3439425Z Submodule path 'third_party/onnx-tensorrt': checked out 'c153211418a7c57ce071d9ce2a41f8d1c85a878f' 2022-08-17T12:32:32.3456407Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx' 2022-08-17T12:32:32.3482777Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx'... 2022-08-17T12:32:34.1343313Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx': checked out '765f5ee823a67a866f4bd28a9860e81f3c811ce8' 2022-08-17T12:32:34.1364963Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-08-17T12:32:34.1367981Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-08-17T12:32:34.1395490Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark'... 2022-08-17T12:32:34.5478561Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11'... 
2022-08-17T12:32:35.4200530Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark': checked out 'e776aa0275e293707b6a0901e0e8d8a8a3679508' 2022-08-17T12:32:35.4958493Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11': checked out 'a1041190c8b8ff0cd9e2f0752248ad5e3789ea0c' 2022-08-17T12:32:35.4975338Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-08-17T12:32:35.5002325Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang'... 2022-08-17T12:32:35.7439854Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2022-08-17T12:32:35.7541801Z Submodule path 'third_party/pocketfft': checked out 'ea778e37710c07723435b1be58235996d1d43a5a' 2022-08-17T12:32:36.0641562Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2022-08-17T12:32:36.0663548Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark' 2022-08-17T12:32:36.0666962Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest' 2022-08-17T12:32:36.0694798Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'... 2022-08-17T12:32:36.4925447Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'... 
2022-08-17T12:32:37.4735947Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2022-08-17T12:32:37.5539887Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2022-08-17T12:32:37.5635695Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2022-08-17T12:32:37.5757867Z Submodule path 'third_party/pthreadpool': checked out 'a134dd5d4cee80cce15db81a72e7f929d71dd413' 2022-08-17T12:32:37.6154785Z Submodule path 'third_party/pybind11': checked out 'aa304c9c7d725ffb9d10af08a3b34cb372307020' 2022-08-17T12:32:37.6252877Z Submodule path 'third_party/python-enum': checked out '4cfedc426c4e2fc52e3f5c2b4297e15ed8d6b8c7' 2022-08-17T12:32:37.6581552Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2022-08-17T12:32:37.6685785Z Submodule path 'third_party/python-six': checked out '15e31431af97e5e64b80af0a3f598d382bcdd49a' 2022-08-17T12:32:37.7204022Z Submodule path 'third_party/sleef': checked out 'e0a003ee838b75d11763aa9c3ef17bf71a725bff' 2022-08-17T12:32:37.8565654Z Submodule path 'third_party/tbb': checked out 'a51a90bc609bb73db8ea13841b5cf7aa4344d4a9' 2022-08-17T12:32:37.8871611Z Submodule path 'third_party/tensorpipe': checked out '52791a2fd214b2a9dc5759d36725909c1daa7f2e' 2022-08-17T12:32:37.8889156Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest' 2022-08-17T12:32:37.8892969Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop' 2022-08-17T12:32:37.8896852Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv' 2022-08-17T12:32:37.8900945Z Submodule 'third_party/pybind11' 
(https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11' 2022-08-17T12:32:37.8928322Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'... 2022-08-17T12:32:38.8718321Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'... 2022-08-17T12:32:39.2813906Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'... 2022-08-17T12:32:41.1956410Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'... 2022-08-17T12:32:42.3412940Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2022-08-17T12:32:42.3583758Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2022-08-17T12:32:42.4363918Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '1dff88e5161cba5c59276d2070d2e304e4dcb242' 2022-08-17T12:32:42.4686044Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2022-08-17T12:32:42.4703001Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-08-17T12:32:42.4729351Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'... 
2022-08-17T12:32:42.6961457Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2022-08-17T12:32:42.8524873Z Submodule path 'third_party/zstd': checked out 'aec56a52fbab207fc639a1937d1e708a282edca8' 2022-08-17T12:32:42.8558119Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2022-08-17T12:32:42.8880851Z Entering 'android/libs/fbjni' 2022-08-17T12:32:42.8924471Z Entering 'third_party/FP16' 2022-08-17T12:32:42.8968449Z Entering 'third_party/FXdiv' 2022-08-17T12:32:42.9011753Z Entering 'third_party/NNPACK' 2022-08-17T12:32:42.9054069Z Entering 'third_party/QNNPACK' 2022-08-17T12:32:42.9096952Z Entering 'third_party/XNNPACK' 2022-08-17T12:32:42.9151072Z Entering 'third_party/benchmark' 2022-08-17T12:32:42.9193914Z Entering 'third_party/cpuinfo' 2022-08-17T12:32:42.9236321Z Entering 'third_party/cub' 2022-08-17T12:32:42.9278900Z Entering 'third_party/cudnn_frontend' 2022-08-17T12:32:42.9327475Z Entering 'third_party/eigen' 2022-08-17T12:32:42.9373682Z Entering 'third_party/fbgemm' 2022-08-17T12:32:42.9417170Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-08-17T12:32:42.9458943Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-08-17T12:32:42.9501457Z Entering 'third_party/fbgemm/third_party/googletest' 2022-08-17T12:32:42.9543556Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-08-17T12:32:42.9586049Z Entering 'third_party/flatbuffers' 2022-08-17T12:32:42.9630452Z Entering 'third_party/fmt' 2022-08-17T12:32:42.9672926Z Entering 'third_party/foxi' 2022-08-17T12:32:42.9715008Z Entering 'third_party/gemmlowp/gemmlowp' 2022-08-17T12:32:42.9757118Z Entering 'third_party/gloo' 2022-08-17T12:32:42.9800157Z Entering 'third_party/googletest' 2022-08-17T12:32:42.9843259Z Entering 'third_party/ideep' 2022-08-17T12:32:42.9884188Z Entering 'third_party/ideep/mkl-dnn' 2022-08-17T12:32:42.9927496Z Entering 
'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-08-17T12:32:42.9977476Z Entering 'third_party/ios-cmake' 2022-08-17T12:32:43.0019925Z Entering 'third_party/ittapi' 2022-08-17T12:32:43.0061083Z Entering 'third_party/kineto' 2022-08-17T12:32:43.0103968Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-08-17T12:32:43.0144641Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-08-17T12:32:43.0187789Z Entering 'third_party/nccl/nccl' 2022-08-17T12:32:43.0230882Z Entering 'third_party/neon2sse' 2022-08-17T12:32:43.0273099Z Entering 'third_party/nlohmann' 2022-08-17T12:32:43.0316086Z Entering 'third_party/onnx' 2022-08-17T12:32:43.0370624Z Entering 'third_party/onnx/third_party/benchmark' 2022-08-17T12:32:43.0412432Z Entering 'third_party/onnx/third_party/pybind11' 2022-08-17T12:32:43.0458053Z Entering 'third_party/onnx-tensorrt' 2022-08-17T12:32:43.0499674Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-08-17T12:32:43.0547488Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-08-17T12:32:43.0590089Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-08-17T12:32:43.0631957Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-08-17T12:32:43.0678548Z Entering 'third_party/pocketfft' 2022-08-17T12:32:43.0720289Z Entering 'third_party/protobuf' 2022-08-17T12:32:43.0766254Z Entering 'third_party/protobuf/third_party/benchmark' 2022-08-17T12:32:43.0808091Z Entering 'third_party/protobuf/third_party/googletest' 2022-08-17T12:32:43.0851786Z Entering 'third_party/psimd' 2022-08-17T12:32:43.0894287Z Entering 'third_party/pthreadpool' 2022-08-17T12:32:43.0937247Z Entering 'third_party/pybind11' 2022-08-17T12:32:43.0979484Z Entering 'third_party/python-enum' 2022-08-17T12:32:43.1021943Z Entering 'third_party/python-peachpy' 2022-08-17T12:32:43.1064896Z Entering 'third_party/python-six' 2022-08-17T12:32:43.1106833Z Entering 
'third_party/sleef' 2022-08-17T12:32:43.1149381Z Entering 'third_party/tbb' 2022-08-17T12:32:43.1194126Z Entering 'third_party/tensorpipe' 2022-08-17T12:32:43.1236393Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-08-17T12:32:43.1278354Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-08-17T12:32:43.1319650Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-08-17T12:32:43.1361272Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-08-17T12:32:43.1401570Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-08-17T12:32:43.1446110Z Entering 'third_party/zstd' 2022-08-17T12:32:43.1497362Z ##[endgroup] 2022-08-17T12:32:43.1499601Z ##[group]Persisting credentials for submodules 2022-08-17T12:32:43.1508051Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || : 2022-08-17T12:32:43.1817607Z Entering 'android/libs/fbjni' 2022-08-17T12:32:43.1858939Z Entering 'third_party/FP16' 2022-08-17T12:32:43.1900469Z Entering 'third_party/FXdiv' 2022-08-17T12:32:43.1943721Z Entering 'third_party/NNPACK' 2022-08-17T12:32:43.1986165Z Entering 'third_party/QNNPACK' 2022-08-17T12:32:43.2027199Z Entering 'third_party/XNNPACK' 2022-08-17T12:32:43.2079369Z Entering 'third_party/benchmark' 2022-08-17T12:32:43.2121414Z Entering 'third_party/cpuinfo' 2022-08-17T12:32:43.2163273Z Entering 'third_party/cub' 2022-08-17T12:32:43.2204677Z Entering 'third_party/cudnn_frontend' 2022-08-17T12:32:43.2252271Z Entering 'third_party/eigen' 2022-08-17T12:32:43.2295648Z Entering 'third_party/fbgemm' 2022-08-17T12:32:43.2336648Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-08-17T12:32:43.2379251Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-08-17T12:32:43.2419965Z Entering 'third_party/fbgemm/third_party/googletest' 2022-08-17T12:32:43.2461420Z Entering 
'third_party/fbgemm/third_party/hipify_torch' 2022-08-17T12:32:43.2503102Z Entering 'third_party/flatbuffers' 2022-08-17T12:32:43.2547367Z Entering 'third_party/fmt' 2022-08-17T12:32:43.2589714Z Entering 'third_party/foxi' 2022-08-17T12:32:43.2630572Z Entering 'third_party/gemmlowp/gemmlowp' 2022-08-17T12:32:43.2672282Z Entering 'third_party/gloo' 2022-08-17T12:32:43.2713584Z Entering 'third_party/googletest' 2022-08-17T12:32:43.2754656Z Entering 'third_party/ideep' 2022-08-17T12:32:43.2795106Z Entering 'third_party/ideep/mkl-dnn' 2022-08-17T12:32:43.2837370Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-08-17T12:32:43.2884556Z Entering 'third_party/ios-cmake' 2022-08-17T12:32:43.2925504Z Entering 'third_party/ittapi' 2022-08-17T12:32:43.2966811Z Entering 'third_party/kineto' 2022-08-17T12:32:43.3007916Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-08-17T12:32:43.3048740Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-08-17T12:32:43.3091601Z Entering 'third_party/nccl/nccl' 2022-08-17T12:32:43.3132116Z Entering 'third_party/neon2sse' 2022-08-17T12:32:43.3172416Z Entering 'third_party/nlohmann' 2022-08-17T12:32:43.3214391Z Entering 'third_party/onnx' 2022-08-17T12:32:43.3268359Z Entering 'third_party/onnx/third_party/benchmark' 2022-08-17T12:32:43.3309869Z Entering 'third_party/onnx/third_party/pybind11' 2022-08-17T12:32:43.3352973Z Entering 'third_party/onnx-tensorrt' 2022-08-17T12:32:43.3393759Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-08-17T12:32:43.3439810Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-08-17T12:32:43.3480491Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-08-17T12:32:43.3521252Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-08-17T12:32:43.3566648Z Entering 'third_party/pocketfft' 2022-08-17T12:32:43.3608040Z Entering 'third_party/protobuf' 
2022-08-17T12:32:43.3653168Z Entering 'third_party/protobuf/third_party/benchmark' 2022-08-17T12:32:43.3695295Z Entering 'third_party/protobuf/third_party/googletest' 2022-08-17T12:32:43.3736958Z Entering 'third_party/psimd' 2022-08-17T12:32:43.3779067Z Entering 'third_party/pthreadpool' 2022-08-17T12:32:43.3819779Z Entering 'third_party/pybind11' 2022-08-17T12:32:43.3862680Z Entering 'third_party/python-enum' 2022-08-17T12:32:43.3904461Z Entering 'third_party/python-peachpy' 2022-08-17T12:32:43.3945183Z Entering 'third_party/python-six' 2022-08-17T12:32:43.3985570Z Entering 'third_party/sleef' 2022-08-17T12:32:43.4026490Z Entering 'third_party/tbb' 2022-08-17T12:32:43.4070429Z Entering 'third_party/tensorpipe' 2022-08-17T12:32:43.4111501Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-08-17T12:32:43.4152434Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-08-17T12:32:43.4194163Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-08-17T12:32:43.4235010Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-08-17T12:32:43.4275626Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-08-17T12:32:43.4319953Z Entering 'third_party/zstd' 2022-08-17T12:32:43.4374822Z [command]/usr/bin/git submodule foreach --recursive git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url 2022-08-17T12:32:43.4683372Z Entering 'android/libs/fbjni' 2022-08-17T12:32:43.4722004Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2022-08-17T12:32:43.4739001Z Entering 'third_party/FP16' 2022-08-17T12:32:43.4777682Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2022-08-17T12:32:43.4795592Z Entering 'third_party/FXdiv' 2022-08-17T12:32:43.4834634Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2022-08-17T12:32:43.4851901Z Entering 'third_party/NNPACK' 2022-08-17T12:32:43.4891306Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2022-08-17T12:32:43.4909076Z Entering 'third_party/QNNPACK' 2022-08-17T12:32:43.4947320Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/QNNPACK/config remote.origin.url 2022-08-17T12:32:43.4964266Z Entering 'third_party/XNNPACK' 2022-08-17T12:32:43.5002972Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2022-08-17T12:32:43.5031175Z Entering 'third_party/benchmark' 2022-08-17T12:32:43.5069663Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2022-08-17T12:32:43.5086571Z Entering 'third_party/cpuinfo' 2022-08-17T12:32:43.5124733Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2022-08-17T12:32:43.5142333Z Entering 'third_party/cub' 2022-08-17T12:32:43.5180718Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cub/config remote.origin.url 2022-08-17T12:32:43.5197453Z Entering 'third_party/cudnn_frontend' 2022-08-17T12:32:43.5235455Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2022-08-17T12:32:43.5258160Z Entering 'third_party/eigen' 2022-08-17T12:32:43.5296556Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/eigen/config remote.origin.url 2022-08-17T12:32:43.5316594Z Entering 'third_party/fbgemm' 2022-08-17T12:32:43.5354778Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2022-08-17T12:32:43.5371763Z 
Entering 'third_party/fbgemm/third_party/asmjit' 2022-08-17T12:32:43.5409738Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/asmjit/config remote.origin.url 2022-08-17T12:32:43.5427147Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-08-17T12:32:43.5464967Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/cpuinfo/config remote.origin.url 2022-08-17T12:32:43.5482574Z Entering 'third_party/fbgemm/third_party/googletest' 2022-08-17T12:32:43.5521944Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/googletest/config remote.origin.url 2022-08-17T12:32:43.5539092Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-08-17T12:32:43.5576755Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/hipify_torch/config remote.origin.url 2022-08-17T12:32:43.5595095Z Entering 'third_party/flatbuffers' 2022-08-17T12:32:43.5633885Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2022-08-17T12:32:43.5652590Z Entering 'third_party/fmt' 2022-08-17T12:32:43.5691367Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2022-08-17T12:32:43.5708227Z Entering 'third_party/foxi' 2022-08-17T12:32:43.5749796Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/foxi/config remote.origin.url 2022-08-17T12:32:43.5766430Z Entering 'third_party/gemmlowp/gemmlowp' 2022-08-17T12:32:43.5805914Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2022-08-17T12:32:43.5822929Z Entering 'third_party/gloo' 2022-08-17T12:32:43.5861697Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config 
remote.origin.url 2022-08-17T12:32:43.5878694Z Entering 'third_party/googletest' 2022-08-17T12:32:43.5917917Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2022-08-17T12:32:43.5934932Z Entering 'third_party/ideep' 2022-08-17T12:32:43.5973331Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2022-08-17T12:32:43.5989925Z Entering 'third_party/ideep/mkl-dnn' 2022-08-17T12:32:43.6028564Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2022-08-17T12:32:43.6047475Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-08-17T12:32:43.6085968Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/modules/third_party/oneDNN/config remote.origin.url 2022-08-17T12:32:43.6109598Z Entering 'third_party/ios-cmake' 2022-08-17T12:32:43.6147622Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ios-cmake/config remote.origin.url 2022-08-17T12:32:43.6164415Z Entering 'third_party/ittapi' 2022-08-17T12:32:43.6202323Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2022-08-17T12:32:43.6219278Z Entering 'third_party/kineto' 2022-08-17T12:32:43.6257492Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2022-08-17T12:32:43.6274878Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-08-17T12:32:43.6313196Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2022-08-17T12:32:43.6330670Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-08-17T12:32:43.6368877Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2022-08-17T12:32:43.6388059Z Entering 'third_party/nccl/nccl' 2022-08-17T12:32:43.6426154Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nccl/nccl/config remote.origin.url 2022-08-17T12:32:43.6443584Z Entering 'third_party/neon2sse' 2022-08-17T12:32:43.6482262Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/neon2sse/config remote.origin.url 2022-08-17T12:32:43.6499650Z Entering 'third_party/nlohmann' 2022-08-17T12:32:43.6537951Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2022-08-17T12:32:43.6556578Z Entering 'third_party/onnx' 2022-08-17T12:32:43.6594752Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2022-08-17T12:32:43.6623588Z Entering 'third_party/onnx/third_party/benchmark' 2022-08-17T12:32:43.6662072Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/benchmark/config remote.origin.url 2022-08-17T12:32:43.6679371Z Entering 'third_party/onnx/third_party/pybind11' 2022-08-17T12:32:43.6717846Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2022-08-17T12:32:43.6737005Z Entering 'third_party/onnx-tensorrt' 2022-08-17T12:32:43.6776397Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/config remote.origin.url 2022-08-17T12:32:43.6793284Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-08-17T12:32:43.6831951Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/config remote.origin.url 2022-08-17T12:32:43.6853959Z Entering 
'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-08-17T12:32:43.6892537Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/benchmark/config remote.origin.url 2022-08-17T12:32:43.6910354Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-08-17T12:32:43.6948829Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2022-08-17T12:32:43.6965316Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-08-17T12:32:43.7004698Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2022-08-17T12:32:43.7026253Z Entering 'third_party/pocketfft' 2022-08-17T12:32:43.7065470Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2022-08-17T12:32:43.7082127Z Entering 'third_party/protobuf' 2022-08-17T12:32:43.7120961Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2022-08-17T12:32:43.7141200Z Entering 'third_party/protobuf/third_party/benchmark' 2022-08-17T12:32:43.7180196Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2022-08-17T12:32:43.7197474Z Entering 'third_party/protobuf/third_party/googletest' 2022-08-17T12:32:43.7235528Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2022-08-17T12:32:43.7254090Z Entering 'third_party/psimd' 2022-08-17T12:32:43.7292727Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2022-08-17T12:32:43.7310627Z Entering 'third_party/pthreadpool' 2022-08-17T12:32:43.7348936Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2022-08-17T12:32:43.7365627Z Entering 'third_party/pybind11' 2022-08-17T12:32:43.7403792Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2022-08-17T12:32:43.7421107Z Entering 'third_party/python-enum' 2022-08-17T12:32:43.7460026Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-enum/config remote.origin.url 2022-08-17T12:32:43.7477687Z Entering 'third_party/python-peachpy' 2022-08-17T12:32:43.7516698Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2022-08-17T12:32:43.7533074Z Entering 'third_party/python-six' 2022-08-17T12:32:43.7571377Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-six/config remote.origin.url 2022-08-17T12:32:43.7588814Z Entering 'third_party/sleef' 2022-08-17T12:32:43.7627192Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2022-08-17T12:32:43.7644033Z Entering 'third_party/tbb' 2022-08-17T12:32:43.7682238Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tbb/config remote.origin.url 2022-08-17T12:32:43.7701482Z Entering 'third_party/tensorpipe' 2022-08-17T12:32:43.7740586Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2022-08-17T12:32:43.7758013Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-08-17T12:32:43.7795729Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2022-08-17T12:32:43.7812904Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-08-17T12:32:43.7851951Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2022-08-17T12:32:43.7869337Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-08-17T12:32:43.7907624Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2022-08-17T12:32:43.7924452Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-08-17T12:32:43.7962952Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2022-08-17T12:32:43.7979192Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-08-17T12:32:43.8017661Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2022-08-17T12:32:43.8037737Z Entering 'third_party/zstd' 2022-08-17T12:32:43.8076336Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/zstd/config remote.origin.url 2022-08-17T12:32:43.8891134Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2022-08-17T12:32:43.9201629Z Entering 'android/libs/fbjni' 2022-08-17T12:32:43.9243901Z Entering 'third_party/FP16' 2022-08-17T12:32:43.9286835Z Entering 'third_party/FXdiv' 2022-08-17T12:32:43.9329807Z Entering 'third_party/NNPACK' 2022-08-17T12:32:43.9373307Z Entering 'third_party/QNNPACK' 2022-08-17T12:32:43.9415613Z Entering 'third_party/XNNPACK' 2022-08-17T12:32:43.9469068Z Entering 'third_party/benchmark' 
2022-08-17T12:32:43.9511955Z Entering 'third_party/cpuinfo' 2022-08-17T12:32:43.9555823Z Entering 'third_party/cub' 2022-08-17T12:32:43.9597616Z Entering 'third_party/cudnn_frontend' 2022-08-17T12:32:43.9644645Z Entering 'third_party/eigen' 2022-08-17T12:32:43.9690688Z Entering 'third_party/fbgemm' 2022-08-17T12:32:43.9733673Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-08-17T12:32:43.9776417Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-08-17T12:32:43.9818494Z Entering 'third_party/fbgemm/third_party/googletest' 2022-08-17T12:32:43.9860847Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-08-17T12:32:43.9904275Z Entering 'third_party/flatbuffers' 2022-08-17T12:32:43.9949227Z Entering 'third_party/fmt' 2022-08-17T12:32:43.9992119Z Entering 'third_party/foxi' 2022-08-17T12:32:44.0034361Z Entering 'third_party/gemmlowp/gemmlowp' 2022-08-17T12:32:44.0077981Z Entering 'third_party/gloo' 2022-08-17T12:32:44.0121381Z Entering 'third_party/googletest' 2022-08-17T12:32:44.0163927Z Entering 'third_party/ideep' 2022-08-17T12:32:44.0205722Z Entering 'third_party/ideep/mkl-dnn' 2022-08-17T12:32:44.0249840Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-08-17T12:32:44.0298364Z Entering 'third_party/ios-cmake' 2022-08-17T12:32:44.0341133Z Entering 'third_party/ittapi' 2022-08-17T12:32:44.0384280Z Entering 'third_party/kineto' 2022-08-17T12:32:44.0426005Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-08-17T12:32:44.0468041Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-08-17T12:32:44.0511850Z Entering 'third_party/nccl/nccl' 2022-08-17T12:32:44.0554077Z Entering 'third_party/neon2sse' 2022-08-17T12:32:44.0595731Z Entering 'third_party/nlohmann' 2022-08-17T12:32:44.0639632Z Entering 'third_party/onnx' 2022-08-17T12:32:44.0694975Z Entering 'third_party/onnx/third_party/benchmark' 2022-08-17T12:32:44.0738672Z Entering 'third_party/onnx/third_party/pybind11' 2022-08-17T12:32:44.0784466Z Entering 
'third_party/onnx-tensorrt' 2022-08-17T12:32:44.0826799Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-08-17T12:32:44.0874598Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-08-17T12:32:44.0918154Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-08-17T12:32:44.0961201Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-08-17T12:32:44.1008454Z Entering 'third_party/pocketfft' 2022-08-17T12:32:44.1051553Z Entering 'third_party/protobuf' 2022-08-17T12:32:44.1097113Z Entering 'third_party/protobuf/third_party/benchmark' 2022-08-17T12:32:44.1138979Z Entering 'third_party/protobuf/third_party/googletest' 2022-08-17T12:32:44.1183048Z Entering 'third_party/psimd' 2022-08-17T12:32:44.1225764Z Entering 'third_party/pthreadpool' 2022-08-17T12:32:44.1267917Z Entering 'third_party/pybind11' 2022-08-17T12:32:44.1310743Z Entering 'third_party/python-enum' 2022-08-17T12:32:44.1352345Z Entering 'third_party/python-peachpy' 2022-08-17T12:32:44.1395506Z Entering 'third_party/python-six' 2022-08-17T12:32:44.1436891Z Entering 'third_party/sleef' 2022-08-17T12:32:44.1479816Z Entering 'third_party/tbb' 2022-08-17T12:32:44.1523846Z Entering 'third_party/tensorpipe' 2022-08-17T12:32:44.1566476Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-08-17T12:32:44.1609546Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-08-17T12:32:44.1651866Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-08-17T12:32:44.1693922Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-08-17T12:32:44.1735727Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-08-17T12:32:44.1781262Z Entering 'third_party/zstd' 2022-08-17T12:32:44.1837034Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2022-08-17T12:32:44.2148227Z Entering 'android/libs/fbjni' 
2022-08-17T12:32:44.2190351Z Entering 'third_party/FP16' 2022-08-17T12:32:44.2232491Z Entering 'third_party/FXdiv' 2022-08-17T12:32:44.2274846Z Entering 'third_party/NNPACK' 2022-08-17T12:32:44.2317315Z Entering 'third_party/QNNPACK' 2022-08-17T12:32:44.2359370Z Entering 'third_party/XNNPACK' 2022-08-17T12:32:44.2411799Z Entering 'third_party/benchmark' 2022-08-17T12:32:44.2453821Z Entering 'third_party/cpuinfo' 2022-08-17T12:32:44.2496577Z Entering 'third_party/cub' 2022-08-17T12:32:44.2539828Z Entering 'third_party/cudnn_frontend' 2022-08-17T12:32:44.2588652Z Entering 'third_party/eigen' 2022-08-17T12:32:44.2633564Z Entering 'third_party/fbgemm' 2022-08-17T12:32:44.2676160Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-08-17T12:32:44.2718202Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-08-17T12:32:44.2760070Z Entering 'third_party/fbgemm/third_party/googletest' 2022-08-17T12:32:44.2802739Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-08-17T12:32:44.2846113Z Entering 'third_party/flatbuffers' 2022-08-17T12:32:44.2890224Z Entering 'third_party/fmt' 2022-08-17T12:32:44.2932791Z Entering 'third_party/foxi' 2022-08-17T12:32:44.2974388Z Entering 'third_party/gemmlowp/gemmlowp' 2022-08-17T12:32:44.3016804Z Entering 'third_party/gloo' 2022-08-17T12:32:44.3059307Z Entering 'third_party/googletest' 2022-08-17T12:32:44.3101801Z Entering 'third_party/ideep' 2022-08-17T12:32:44.3143663Z Entering 'third_party/ideep/mkl-dnn' 2022-08-17T12:32:44.3188151Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-08-17T12:32:44.3238158Z Entering 'third_party/ios-cmake' 2022-08-17T12:32:44.3279127Z Entering 'third_party/ittapi' 2022-08-17T12:32:44.3321076Z Entering 'third_party/kineto' 2022-08-17T12:32:44.3363418Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-08-17T12:32:44.3405793Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-08-17T12:32:44.3449711Z Entering 'third_party/nccl/nccl' 
2022-08-17T12:32:44.3493051Z Entering 'third_party/neon2sse' 2022-08-17T12:32:44.3534311Z Entering 'third_party/nlohmann' 2022-08-17T12:32:44.3578213Z Entering 'third_party/onnx' 2022-08-17T12:32:44.3633640Z Entering 'third_party/onnx/third_party/benchmark' 2022-08-17T12:32:44.3675880Z Entering 'third_party/onnx/third_party/pybind11' 2022-08-17T12:32:44.3719268Z Entering 'third_party/onnx-tensorrt' 2022-08-17T12:32:44.3761482Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-08-17T12:32:44.3808238Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-08-17T12:32:44.3850571Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-08-17T12:32:44.3893736Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-08-17T12:32:44.3940563Z Entering 'third_party/pocketfft' 2022-08-17T12:32:44.3982640Z Entering 'third_party/protobuf' 2022-08-17T12:32:44.4028439Z Entering 'third_party/protobuf/third_party/benchmark' 2022-08-17T12:32:44.4070292Z Entering 'third_party/protobuf/third_party/googletest' 2022-08-17T12:32:44.4113802Z Entering 'third_party/psimd' 2022-08-17T12:32:44.4155504Z Entering 'third_party/pthreadpool' 2022-08-17T12:32:44.4197121Z Entering 'third_party/pybind11' 2022-08-17T12:32:44.4238474Z Entering 'third_party/python-enum' 2022-08-17T12:32:44.4280259Z Entering 'third_party/python-peachpy' 2022-08-17T12:32:44.4322127Z Entering 'third_party/python-six' 2022-08-17T12:32:44.4364287Z Entering 'third_party/sleef' 2022-08-17T12:32:44.4406995Z Entering 'third_party/tbb' 2022-08-17T12:32:44.4450511Z Entering 'third_party/tensorpipe' 2022-08-17T12:32:44.4493788Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-08-17T12:32:44.4536734Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-08-17T12:32:44.4578880Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-08-17T12:32:44.4620884Z Entering 'third_party/tensorpipe/third_party/pybind11' 
2022-08-17T12:32:44.4662182Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-08-17T12:32:44.4706605Z Entering 'third_party/zstd' 2022-08-17T12:32:44.4757885Z ##[endgroup] 2022-08-17T12:32:44.4801642Z [command]/usr/bin/git log -1 --format='%H' 2022-08-17T12:32:44.4831247Z 'ce6a3c605df99d1df57c0dda75c06d748e54ed2a' 2022-08-17T12:32:44.4979110Z Prepare all required actions 2022-08-17T12:32:44.5014424Z ##[group]Run ./.github/actions/setup-linux 2022-08-17T12:32:44.5014781Z env: 2022-08-17T12:32:44.5015139Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:32:44.5015414Z ##[endgroup] 2022-08-17T12:32:44.5075277Z ##[group]Run set -euo pipefail 2022-08-17T12:32:44.5075605Z set -euo pipefail 2022-08-17T12:32:44.5075904Z function get_ec2_metadata() { 2022-08-17T12:32:44.5076253Z  # Pulled from instance metadata endpoint for EC2 2022-08-17T12:32:44.5076733Z  # see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 2022-08-17T12:32:44.5077152Z  category=$1 2022-08-17T12:32:44.5077496Z  curl -fsSL "http://169.254.169.254/latest/meta-data/${category}" 2022-08-17T12:32:44.5077805Z } 2022-08-17T12:32:44.5078096Z echo "ami-id: $(get_ec2_metadata ami-id)" 2022-08-17T12:32:44.5078494Z echo "instance-id: $(get_ec2_metadata instance-id)" 2022-08-17T12:32:44.5078880Z echo "instance-type: $(get_ec2_metadata instance-type)" 2022-08-17T12:32:44.5079233Z echo "system info $(uname -a)" 2022-08-17T12:32:44.5091862Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T12:32:44.5092141Z env: 2022-08-17T12:32:44.5092378Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:32:44.5092636Z ##[endgroup] 2022-08-17T12:32:44.5193614Z ami-id: ami-096198a0bccc6bad4 2022-08-17T12:32:44.5255642Z instance-id: i-02fdd1ace63d4e018 2022-08-17T12:32:44.5315573Z instance-type: g3.8xlarge 2022-08-17T12:32:44.5324034Z system info Linux ip-10-0-4-249.ec2.internal 4.14.252-195.483.amzn2.x86_64 #1 SMP Mon Nov 1 20:58:46 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux 
2022-08-17T12:32:44.5344104Z ##[group]Run if systemctl is-active --quiet docker; then 2022-08-17T12:32:44.5344611Z if systemctl is-active --quiet docker; then 2022-08-17T12:32:44.5344949Z  echo "Docker daemon is running..."; 2022-08-17T12:32:44.5345213Z else 2022-08-17T12:32:44.5345535Z  echo "Starting docker deamon..." && sudo systemctl start docker; 2022-08-17T12:32:44.5345843Z fi 2022-08-17T12:32:44.5357084Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T12:32:44.5357375Z env: 2022-08-17T12:32:44.5357614Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:32:44.5357854Z ##[endgroup] 2022-08-17T12:32:44.5408016Z Docker daemon is running... 2022-08-17T12:32:44.5427706Z ##[group]Run AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") 2022-08-17T12:32:44.5428169Z AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") 2022-08-17T12:32:44.5428695Z retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-08-17T12:32:44.5429243Z retry aws ecr get-login*** "$AWS_DEFAULT_REGION" | docker login --username AWS \ 2022-08-17T12:32:44.5429699Z  --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" 2022-08-17T12:32:44.5440618Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T12:32:44.5440909Z env: 2022-08-17T12:32:44.5441133Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:32:44.5441399Z AWS_RETRY_MODE: standard 2022-08-17T12:32:44.5441653Z AWS_MAX_ATTEMPTS: 5 2022-08-17T12:32:44.5441906Z AWS_DEFAULT_REGION: us-east-1 2022-08-17T12:32:44.5442167Z ##[endgroup] 2022-08-17T12:32:45.5047156Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2022-08-17T12:32:45.5047618Z Configure a credential helper to remove this warning. 
See 2022-08-17T12:32:45.5048221Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2022-08-17T12:32:45.5048547Z 2022-08-17T12:32:45.5050128Z Login Succeeded 2022-08-17T12:32:45.5126141Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-08-17T12:32:45.5126546Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-08-17T12:32:45.5127028Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-08-17T12:32:45.5139705Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T12:32:45.5140000Z env: 2022-08-17T12:32:45.5140239Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:32:45.5140481Z ##[endgroup] 2022-08-17T12:32:45.5213818Z Prepare all required actions 2022-08-17T12:32:45.5214189Z Getting action download info 2022-08-17T12:32:45.6675531Z Download action repository 'seemethere/add-github-ssh-key@v1' (SHA:105f7619adc4054f5f1be5f79ebd354d82384638) 2022-08-17T12:32:45.7919360Z ##[group]Run ./.github/actions/setup-ssh 2022-08-17T12:32:45.7919628Z with: 2022-08-17T12:32:45.7920077Z github-secret: *** 2022-08-17T12:32:45.7920326Z env: 2022-08-17T12:32:45.7920559Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:32:45.7920797Z ##[endgroup] 2022-08-17T12:32:45.7947474Z ##[group]Run seemethere/add-github-ssh-key@v1 2022-08-17T12:32:45.7947762Z with: 2022-08-17T12:32:45.7948141Z GITHUB_TOKEN: *** 2022-08-17T12:32:45.7948394Z activate-with-label: false 2022-08-17T12:32:45.7948672Z label: with-ssh 2022-08-17T12:32:45.7948930Z remove-existing-keys: true 2022-08-17T12:32:45.7949159Z env: 2022-08-17T12:32:45.7949388Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:32:45.7949644Z ##[endgroup] 2022-08-17T12:32:46.2109523Z Grabbing public ssh keys from https://github.com/IvanYashchuk.keys 2022-08-17T12:32:46.2958360Z ~/.ssh/authorized_keys file found on node, removing ~/.ssh and starting fresh 2022-08-17T12:32:46.2978289Z Public keys pulled and installed to /home/ec2-user/.ssh/authorized_keys 2022-08-17T12:32:46.3014926Z 
Login using: ssh ec2-user@ec2-35-174-176-34.compute-1.amazonaws.com 2022-08-17T12:32:46.3073692Z Prepare all required actions 2022-08-17T12:32:46.3098209Z ##[group]Run ./.github/actions/pull-docker-image 2022-08-17T12:32:46.3098498Z with: 2022-08-17T12:32:46.3098979Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:a347f7e7645f04fc68e4f87c73cf0385233153b8 2022-08-17T12:32:46.3099422Z env: 2022-08-17T12:32:46.3099668Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:32:46.3099929Z ##[endgroup] 2022-08-17T12:32:46.3117652Z ##[group]Run retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-08-17T12:32:46.3118022Z retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-08-17T12:32:46.3118386Z # ignore output since only exit code is used for conditional 2022-08-17T12:32:46.3118751Z # only pull docker image if it's not available locally 2022-08-17T12:32:46.3119153Z if ! docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2022-08-17T12:32:46.3119561Z  retry docker pull "${DOCKER_IMAGE}" 2022-08-17T12:32:46.3119814Z fi 2022-08-17T12:32:46.3132220Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T12:32:46.3132513Z env: 2022-08-17T12:32:46.3132748Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:32:46.3133233Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:a347f7e7645f04fc68e4f87c73cf0385233153b8 2022-08-17T12:32:46.3133710Z ##[endgroup] 2022-08-17T12:32:46.5524903Z a347f7e7645f04fc68e4f87c73cf0385233153b8: Pulling from pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7 2022-08-17T12:32:46.5525682Z 40dd5be53814: Pulling fs layer 2022-08-17T12:32:46.5525984Z bd44602516a4: Pulling fs layer 2022-08-17T12:32:46.5526247Z 8ebfb31ea67d: Pulling fs layer 2022-08-17T12:32:46.5526527Z 1589dc294916: Pulling fs layer 2022-08-17T12:32:46.5526805Z 2c3a764ff1ef: Pulling fs layer 2022-08-17T12:32:46.5527194Z 
2fb24fb5f7cb: Pulling fs layer 2022-08-17T12:32:46.5527681Z d6e4b45751c9: Pulling fs layer 2022-08-17T12:32:46.5527998Z 98a26bc0781e: Pulling fs layer 2022-08-17T12:32:46.5528284Z 07c42b0591b2: Pulling fs layer 2022-08-17T12:32:46.5528535Z e956f9382b6b: Pulling fs layer 2022-08-17T12:32:46.5528803Z 18e4bf4a1899: Pulling fs layer 2022-08-17T12:32:46.5529169Z 2e31a77c012f: Pulling fs layer 2022-08-17T12:32:46.5529622Z b5d8c31450cf: Pulling fs layer 2022-08-17T12:32:46.5530368Z 88f9a97427bd: Pulling fs layer 2022-08-17T12:32:46.5530838Z 98a26bc0781e: Waiting 2022-08-17T12:32:46.5531184Z 7e28e15c116b: Pulling fs layer 2022-08-17T12:32:46.5531463Z 2f81fa8a1d7c: Pulling fs layer 2022-08-17T12:32:46.5531724Z 1589dc294916: Waiting 2022-08-17T12:32:46.5531965Z 7b5e6a6a5c95: Pulling fs layer 2022-08-17T12:32:46.5532224Z 2fb24fb5f7cb: Waiting 2022-08-17T12:32:46.5532666Z 610cdf67f481: Pulling fs layer 2022-08-17T12:32:46.5532984Z b6d562a0f7ff: Pulling fs layer 2022-08-17T12:32:46.5533256Z a2f34566b9ec: Pulling fs layer 2022-08-17T12:32:46.5533534Z 6b8b98f37e72: Pulling fs layer 2022-08-17T12:32:46.5533776Z 610cdf67f481: Waiting 2022-08-17T12:32:46.5534014Z b5d8c31450cf: Waiting 2022-08-17T12:32:46.5534321Z f3a84a6caa8a: Pulling fs layer 2022-08-17T12:32:46.5534597Z 12e2fbb7e83d: Pulling fs layer 2022-08-17T12:32:46.5534836Z 2e31a77c012f: Waiting 2022-08-17T12:32:46.5535077Z 07c42b0591b2: Waiting 2022-08-17T12:32:46.5535316Z 7e28e15c116b: Waiting 2022-08-17T12:32:46.5535564Z 68b4bfa2dd25: Pulling fs layer 2022-08-17T12:32:46.5535824Z 2f81fa8a1d7c: Waiting 2022-08-17T12:32:46.5536080Z a525f524d1a2: Pulling fs layer 2022-08-17T12:32:46.5536316Z 6b8b98f37e72: Waiting 2022-08-17T12:32:46.5536552Z 88f9a97427bd: Waiting 2022-08-17T12:32:46.5536788Z e956f9382b6b: Waiting 2022-08-17T12:32:46.5537021Z d0ba91206015: Pulling fs layer 2022-08-17T12:32:46.5537275Z a2f34566b9ec: Waiting 2022-08-17T12:32:46.5537528Z a501bf0e4821: Pulling fs layer 2022-08-17T12:32:46.5537761Z a525f524d1a2: 
Waiting 2022-08-17T12:32:46.5537996Z d0ba91206015: Waiting 2022-08-17T12:32:46.5538371Z 1ab1fac16203: Pulling fs layer 2022-08-17T12:32:46.5538643Z 4a71cd45bf40: Pulling fs layer 2022-08-17T12:32:46.5538910Z 7671cd4418ff: Pulling fs layer 2022-08-17T12:32:46.5539178Z a10b036b1ac1: Pulling fs layer 2022-08-17T12:32:46.5539429Z 97187c0d01c8: Pulling fs layer 2022-08-17T12:32:46.5539701Z e01fb7501ddc: Pulling fs layer 2022-08-17T12:32:46.5539955Z 7b5e6a6a5c95: Waiting 2022-08-17T12:32:46.5540195Z 69972009b107: Pulling fs layer 2022-08-17T12:32:46.5540445Z 4a71cd45bf40: Waiting 2022-08-17T12:32:46.5540684Z 7671cd4418ff: Waiting 2022-08-17T12:32:46.5540905Z a10b036b1ac1: Waiting 2022-08-17T12:32:46.5541142Z 97187c0d01c8: Waiting 2022-08-17T12:32:46.5541380Z e01fb7501ddc: Waiting 2022-08-17T12:32:46.5541640Z b1bbb9a03a84: Pulling fs layer 2022-08-17T12:32:46.5541893Z f622f4e8d4ab: Pulling fs layer 2022-08-17T12:32:46.5542157Z 1a39833a4384: Pulling fs layer 2022-08-17T12:32:46.5542409Z 1ab1fac16203: Waiting 2022-08-17T12:32:46.5542649Z a44805fd2007: Pulling fs layer 2022-08-17T12:32:46.5542910Z b1bbb9a03a84: Waiting 2022-08-17T12:32:46.5543166Z 2c313266d2f7: Pulling fs layer 2022-08-17T12:32:46.5543625Z f622f4e8d4ab: Waiting 2022-08-17T12:32:46.5543886Z d18c1a925d74: Pulling fs layer 2022-08-17T12:32:46.5544151Z 7b1d128ef534: Pulling fs layer 2022-08-17T12:32:46.5544405Z bffcf72d25f8: Pulling fs layer 2022-08-17T12:32:46.5544678Z 2be8d647e7ed: Pulling fs layer 2022-08-17T12:32:46.5545137Z 6dd85d7a52f9: Pulling fs layer 2022-08-17T12:32:46.5545435Z 0784f310567d: Pulling fs layer 2022-08-17T12:32:46.5545703Z bf753a2f1ade: Pulling fs layer 2022-08-17T12:32:46.5545971Z 3d29501b932f: Pulling fs layer 2022-08-17T12:32:46.5546201Z 1a39833a4384: Waiting 2022-08-17T12:32:46.5546455Z d71a3181060a: Pulling fs layer 2022-08-17T12:32:46.5546702Z a44805fd2007: Waiting 2022-08-17T12:32:46.5546924Z 2be8d647e7ed: Waiting 2022-08-17T12:32:46.5547171Z bffcf72d25f8: Waiting 
2022-08-17T12:32:46.5547414Z 7b1d128ef534: Waiting 2022-08-17T12:32:46.5547635Z bf753a2f1ade: Waiting 2022-08-17T12:32:46.5547871Z 0784f310567d: Waiting 2022-08-17T12:32:46.5548126Z 415d573a426a: Pulling fs layer 2022-08-17T12:32:46.5548378Z 4f6ed058a00d: Pulling fs layer 2022-08-17T12:32:46.5548649Z 2e8c8ad9734d: Pulling fs layer 2022-08-17T12:32:46.5548903Z d18c1a925d74: Waiting 2022-08-17T12:32:46.5549137Z 458a1b68f724: Pulling fs layer 2022-08-17T12:32:46.5549401Z 415d573a426a: Waiting 2022-08-17T12:32:46.5549640Z 4f6ed058a00d: Waiting 2022-08-17T12:32:46.5549968Z 458a1b68f724: Waiting 2022-08-17T12:32:46.5550200Z 3d29501b932f: Waiting 2022-08-17T12:32:46.5550433Z 6dd85d7a52f9: Waiting 2022-08-17T12:32:46.5550651Z d71a3181060a: Waiting 2022-08-17T12:32:46.5550886Z 2e8c8ad9734d: Waiting 2022-08-17T12:32:46.6847379Z bd44602516a4: Verifying Checksum 2022-08-17T12:32:46.6847684Z bd44602516a4: Download complete 2022-08-17T12:32:46.7819967Z 1589dc294916: Verifying Checksum 2022-08-17T12:32:46.7820280Z 1589dc294916: Download complete 2022-08-17T12:32:46.8465881Z 8ebfb31ea67d: Verifying Checksum 2022-08-17T12:32:46.8466515Z 8ebfb31ea67d: Download complete 2022-08-17T12:32:46.8480604Z 2c3a764ff1ef: Download complete 2022-08-17T12:32:46.8769830Z 40dd5be53814: Verifying Checksum 2022-08-17T12:32:46.8770135Z 40dd5be53814: Download complete 2022-08-17T12:32:46.9402222Z d6e4b45751c9: Verifying Checksum 2022-08-17T12:32:46.9402575Z d6e4b45751c9: Download complete 2022-08-17T12:32:47.0211410Z 07c42b0591b2: Download complete 2022-08-17T12:32:47.1047730Z e956f9382b6b: Verifying Checksum 2022-08-17T12:32:47.1048374Z e956f9382b6b: Download complete 2022-08-17T12:32:47.6070552Z 40dd5be53814: Pull complete 2022-08-17T12:32:47.8749303Z bd44602516a4: Pull complete 2022-08-17T12:32:48.4148802Z 8ebfb31ea67d: Pull complete 2022-08-17T12:32:48.5519956Z 1589dc294916: Pull complete 2022-08-17T12:32:48.6580427Z 2c3a764ff1ef: Pull complete 2022-08-17T12:32:49.3024073Z 18e4bf4a1899: Verifying 
Checksum 2022-08-17T12:32:49.3025208Z 18e4bf4a1899: Download complete 2022-08-17T12:32:49.3832077Z 2e31a77c012f: Verifying Checksum 2022-08-17T12:32:49.3832371Z 2e31a77c012f: Download complete 2022-08-17T12:32:49.4703928Z b5d8c31450cf: Download complete 2022-08-17T12:32:49.5550601Z 88f9a97427bd: Verifying Checksum 2022-08-17T12:32:49.5551126Z 88f9a97427bd: Download complete 2022-08-17T12:32:50.3134696Z 7e28e15c116b: Verifying Checksum 2022-08-17T12:32:50.3135296Z 7e28e15c116b: Download complete 2022-08-17T12:32:50.4169053Z 2f81fa8a1d7c: Verifying Checksum 2022-08-17T12:32:50.4169641Z 2f81fa8a1d7c: Download complete 2022-08-17T12:32:50.5585790Z 7b5e6a6a5c95: Verifying Checksum 2022-08-17T12:32:50.5587063Z 7b5e6a6a5c95: Download complete 2022-08-17T12:32:58.0567948Z 2fb24fb5f7cb: Download complete 2022-08-17T12:32:58.1449914Z b6d562a0f7ff: Verifying Checksum 2022-08-17T12:32:58.1450315Z b6d562a0f7ff: Download complete 2022-08-17T12:32:58.2239146Z a2f34566b9ec: Verifying Checksum 2022-08-17T12:32:58.2239729Z a2f34566b9ec: Download complete 2022-08-17T12:32:58.3060053Z 6b8b98f37e72: Verifying Checksum 2022-08-17T12:32:58.3060648Z 6b8b98f37e72: Download complete 2022-08-17T12:32:58.3851532Z f3a84a6caa8a: Download complete 2022-08-17T12:32:58.4799646Z 12e2fbb7e83d: Verifying Checksum 2022-08-17T12:32:58.4800365Z 12e2fbb7e83d: Download complete 2022-08-17T12:32:58.5692331Z 68b4bfa2dd25: Download complete 2022-08-17T12:32:59.5139561Z a525f524d1a2: Verifying Checksum 2022-08-17T12:32:59.5140163Z a525f524d1a2: Download complete 2022-08-17T12:32:59.5890044Z d0ba91206015: Verifying Checksum 2022-08-17T12:32:59.5890724Z d0ba91206015: Download complete 2022-08-17T12:32:59.6683292Z a501bf0e4821: Verifying Checksum 2022-08-17T12:32:59.6683903Z a501bf0e4821: Download complete 2022-08-17T12:32:59.7481815Z 1ab1fac16203: Verifying Checksum 2022-08-17T12:32:59.7482179Z 1ab1fac16203: Download complete 2022-08-17T12:32:59.8505777Z 4a71cd45bf40: Verifying Checksum 
2022-08-17T12:32:59.8506136Z 4a71cd45bf40: Download complete 2022-08-17T12:32:59.9274700Z 7671cd4418ff: Verifying Checksum 2022-08-17T12:32:59.9275204Z 7671cd4418ff: Download complete 2022-08-17T12:33:01.2453352Z 98a26bc0781e: Verifying Checksum 2022-08-17T12:33:01.2453994Z 98a26bc0781e: Download complete 2022-08-17T12:33:01.3341069Z 97187c0d01c8: Verifying Checksum 2022-08-17T12:33:01.3341681Z 97187c0d01c8: Download complete 2022-08-17T12:33:01.4407743Z e01fb7501ddc: Verifying Checksum 2022-08-17T12:33:01.4408368Z e01fb7501ddc: Download complete 2022-08-17T12:33:01.7256591Z 69972009b107: Verifying Checksum 2022-08-17T12:33:01.7257195Z 69972009b107: Download complete 2022-08-17T12:33:01.8172374Z b1bbb9a03a84: Verifying Checksum 2022-08-17T12:33:01.8172988Z b1bbb9a03a84: Download complete 2022-08-17T12:33:01.9221724Z a10b036b1ac1: Verifying Checksum 2022-08-17T12:33:01.9222273Z a10b036b1ac1: Download complete 2022-08-17T12:33:01.9333253Z f622f4e8d4ab: Verifying Checksum 2022-08-17T12:33:01.9333579Z f622f4e8d4ab: Download complete 2022-08-17T12:33:02.0116870Z a44805fd2007: Verifying Checksum 2022-08-17T12:33:02.0117221Z a44805fd2007: Download complete 2022-08-17T12:33:02.1922282Z 1a39833a4384: Verifying Checksum 2022-08-17T12:33:02.1922869Z 1a39833a4384: Download complete 2022-08-17T12:33:02.2606876Z d18c1a925d74: Verifying Checksum 2022-08-17T12:33:02.2607468Z d18c1a925d74: Download complete 2022-08-17T12:33:02.3398342Z 7b1d128ef534: Verifying Checksum 2022-08-17T12:33:02.3398964Z 7b1d128ef534: Download complete 2022-08-17T12:33:02.4676660Z 2c313266d2f7: Verifying Checksum 2022-08-17T12:33:02.4677009Z 2c313266d2f7: Download complete 2022-08-17T12:33:02.5629146Z 2be8d647e7ed: Verifying Checksum 2022-08-17T12:33:02.5629527Z 2be8d647e7ed: Download complete 2022-08-17T12:33:02.6330743Z 6dd85d7a52f9: Verifying Checksum 2022-08-17T12:33:02.6331193Z 6dd85d7a52f9: Download complete 2022-08-17T12:33:02.7233309Z 0784f310567d: Verifying Checksum 2022-08-17T12:33:02.7233914Z 
0784f310567d: Download complete 2022-08-17T12:33:02.8184875Z bf753a2f1ade: Download complete 2022-08-17T12:33:03.0093574Z 3d29501b932f: Verifying Checksum 2022-08-17T12:33:03.0093944Z 3d29501b932f: Download complete 2022-08-17T12:33:03.0988018Z d71a3181060a: Download complete 2022-08-17T12:33:03.7006214Z 415d573a426a: Verifying Checksum 2022-08-17T12:33:03.7006827Z 415d573a426a: Download complete 2022-08-17T12:33:03.7883602Z 4f6ed058a00d: Verifying Checksum 2022-08-17T12:33:03.7884220Z 4f6ed058a00d: Download complete 2022-08-17T12:33:05.4175453Z bffcf72d25f8: Verifying Checksum 2022-08-17T12:33:05.4175817Z bffcf72d25f8: Download complete 2022-08-17T12:33:05.5061202Z 458a1b68f724: Verifying Checksum 2022-08-17T12:33:05.5061771Z 458a1b68f724: Download complete 2022-08-17T12:33:11.7792044Z 2fb24fb5f7cb: Pull complete 2022-08-17T12:33:11.9043512Z d6e4b45751c9: Pull complete 2022-08-17T12:33:14.3406447Z 610cdf67f481: Verifying Checksum 2022-08-17T12:33:14.3406829Z 610cdf67f481: Download complete 2022-08-17T12:33:33.7253582Z 98a26bc0781e: Pull complete 2022-08-17T12:33:33.7855531Z 2e8c8ad9734d: Verifying Checksum 2022-08-17T12:33:33.7855879Z 2e8c8ad9734d: Download complete 2022-08-17T12:33:35.5723163Z 07c42b0591b2: Pull complete 2022-08-17T12:33:37.4202744Z e956f9382b6b: Pull complete 2022-08-17T12:33:44.8337464Z 18e4bf4a1899: Pull complete 2022-08-17T12:33:46.6808576Z 2e31a77c012f: Pull complete 2022-08-17T12:33:48.5286083Z b5d8c31450cf: Pull complete 2022-08-17T12:33:50.4074523Z 88f9a97427bd: Pull complete 2022-08-17T12:33:54.9569806Z 7e28e15c116b: Pull complete 2022-08-17T12:33:56.8331662Z 2f81fa8a1d7c: Pull complete 2022-08-17T12:33:58.6809285Z 7b5e6a6a5c95: Pull complete 2022-08-17T12:34:33.6508175Z 610cdf67f481: Pull complete 2022-08-17T12:34:33.7729556Z b6d562a0f7ff: Pull complete 2022-08-17T12:34:33.8673786Z a2f34566b9ec: Pull complete 2022-08-17T12:34:33.9639485Z 6b8b98f37e72: Pull complete 2022-08-17T12:34:34.0753870Z f3a84a6caa8a: Pull complete 
2022-08-17T12:34:34.1717694Z 12e2fbb7e83d: Pull complete 2022-08-17T12:34:34.2670702Z 68b4bfa2dd25: Pull complete 2022-08-17T12:34:37.8602022Z a525f524d1a2: Pull complete 2022-08-17T12:34:39.7008318Z d0ba91206015: Pull complete 2022-08-17T12:34:41.5362738Z a501bf0e4821: Pull complete 2022-08-17T12:34:43.5419075Z 1ab1fac16203: Pull complete 2022-08-17T12:34:45.3364365Z 4a71cd45bf40: Pull complete 2022-08-17T12:34:46.8419217Z 7671cd4418ff: Pull complete 2022-08-17T12:34:55.5513505Z a10b036b1ac1: Pull complete 2022-08-17T12:34:58.0057936Z 97187c0d01c8: Pull complete 2022-08-17T12:35:00.8251383Z e01fb7501ddc: Pull complete 2022-08-17T12:35:04.2041563Z 69972009b107: Pull complete 2022-08-17T12:35:06.7073141Z b1bbb9a03a84: Pull complete 2022-08-17T12:35:10.0110526Z f622f4e8d4ab: Pull complete 2022-08-17T12:35:12.2285283Z 1a39833a4384: Pull complete 2022-08-17T12:35:14.1539322Z a44805fd2007: Pull complete 2022-08-17T12:35:17.4686114Z 2c313266d2f7: Pull complete 2022-08-17T12:35:19.5329678Z d18c1a925d74: Pull complete 2022-08-17T12:35:21.8723348Z 7b1d128ef534: Pull complete 2022-08-17T12:35:28.6643282Z bffcf72d25f8: Pull complete 2022-08-17T12:35:30.5095607Z 2be8d647e7ed: Pull complete 2022-08-17T12:35:32.3505819Z 6dd85d7a52f9: Pull complete 2022-08-17T12:35:34.4233780Z 0784f310567d: Pull complete 2022-08-17T12:35:36.2706381Z bf753a2f1ade: Pull complete 2022-08-17T12:35:37.8707177Z 3d29501b932f: Pull complete 2022-08-17T12:35:37.9676921Z d71a3181060a: Pull complete 2022-08-17T12:35:39.8622990Z 415d573a426a: Pull complete 2022-08-17T12:35:39.9666049Z 4f6ed058a00d: Pull complete 2022-08-17T12:36:21.1132869Z 2e8c8ad9734d: Pull complete 2022-08-17T12:36:22.9614201Z 458a1b68f724: Pull complete 2022-08-17T12:36:24.2549453Z Digest: sha256:490eaf5744f7cb16e853fb3063447df07750afa5d8ef966c98f5802528aaa7d2 2022-08-17T12:36:24.7559973Z Status: Downloaded newer image for 
308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:a347f7e7645f04fc68e4f87c73cf0385233153b8 2022-08-17T12:36:25.0380494Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:a347f7e7645f04fc68e4f87c73cf0385233153b8 2022-08-17T12:36:25.0462918Z ##[group]Run nick-fields/retry@71062288b76e2b6214ebde0e673ce0de1755740a 2022-08-17T12:36:25.0463877Z with: 2022-08-17T12:36:25.0464298Z timeout_minutes: 10 2022-08-17T12:36:25.0464542Z max_attempts: 3 2022-08-17T12:36:25.0464931Z command: set -ex bash .github/scripts/install_nvidia_utils_linux.sh echo "GPU_FLAG=--gpus all" >> "${GITHUB_ENV}" 2022-08-17T12:36:25.0465316Z retry_wait_seconds: 10 2022-08-17T12:36:25.0465570Z polling_interval_seconds: 1 2022-08-17T12:36:25.0465842Z warning_on_retry: true 2022-08-17T12:36:25.0466104Z continue_on_error: false 2022-08-17T12:36:25.0466326Z env: 2022-08-17T12:36:25.0466580Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:36:25.0466838Z ##[endgroup] 2022-08-17T12:36:25.0900953Z 2022-08-17T12:36:25.0973555Z == Installing nvidia container toolkit for amzn2 == 2022-08-17T12:36:25.0976303Z + bash .github/scripts/install_nvidia_utils_linux.sh 2022-08-17T12:36:25.0976775Z + sudo yum install -y yum-utils 2022-08-17T12:36:25.5649686Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-08-17T12:36:25.9087561Z Package yum-utils-1.1.31-46.amzn2.0.1.noarch already installed and latest version 2022-08-17T12:36:25.9087999Z Nothing to do 2022-08-17T12:36:25.9284169Z + sudo yum-config-manager --add-repo https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo 2022-08-17T12:36:26.5470145Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-08-17T12:36:26.5762136Z adding repo from: https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo 2022-08-17T12:36:26.5762905Z grabbing file https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo to 
/etc/yum.repos.d/nvidia-docker.repo 2022-08-17T12:36:26.5763483Z repo saved to /etc/yum.repos.d/nvidia-docker.repo 2022-08-17T12:36:26.5907181Z + sudo yum install -y nvidia-docker2 2022-08-17T12:36:27.0528522Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-08-17T12:36:27.0938719Z Retrieving key from https://nvidia.github.io/libnvidia-container/gpgkey 2022-08-17T12:36:27.1026712Z Importing GPG key 0xF796ECB0: 2022-08-17T12:36:27.1027501Z Userid : "NVIDIA CORPORATION (Open Source Projects) " 2022-08-17T12:36:27.1028170Z Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0 2022-08-17T12:36:27.1028737Z From : https://nvidia.github.io/libnvidia-container/gpgkey 2022-08-17T12:36:28.8276562Z Retrieving key from https://nvidia.github.io/nvidia-container-runtime/gpgkey 2022-08-17T12:36:28.8364356Z Importing GPG key 0xF796ECB0: 2022-08-17T12:36:28.8364751Z Userid : "NVIDIA CORPORATION (Open Source Projects) " 2022-08-17T12:36:28.8365375Z Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0 2022-08-17T12:36:28.8365926Z From : https://nvidia.github.io/nvidia-container-runtime/gpgkey 2022-08-17T12:36:29.0591893Z Retrieving key from https://nvidia.github.io/nvidia-docker/gpgkey 2022-08-17T12:36:29.0673122Z Importing GPG key 0xF796ECB0: 2022-08-17T12:36:29.0673536Z Userid : "NVIDIA CORPORATION (Open Source Projects) " 2022-08-17T12:36:29.0673943Z Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0 2022-08-17T12:36:29.0674391Z From : https://nvidia.github.io/nvidia-docker/gpgkey 2022-08-17T12:36:48.8886333Z Resolving Dependencies 2022-08-17T12:36:48.8892180Z --> Running transaction check 2022-08-17T12:36:48.8892882Z ---> Package nvidia-docker2.noarch 0:2.11.0-1 will be installed 2022-08-17T12:36:48.8917713Z --> Processing Dependency: nvidia-container-toolkit >= 1.10.0-1 for package: nvidia-docker2-2.11.0-1.noarch 2022-08-17T12:36:48.9244907Z --> Running transaction check 2022-08-17T12:36:48.9245730Z ---> Package 
nvidia-container-toolkit.x86_64 0:1.10.0-1 will be installed 2022-08-17T12:36:48.9282591Z --> Processing Dependency: libnvidia-container-tools < 2.0.0 for package: nvidia-container-toolkit-1.10.0-1.x86_64 2022-08-17T12:36:48.9403976Z --> Processing Dependency: libnvidia-container-tools >= 1.10.0-1 for package: nvidia-container-toolkit-1.10.0-1.x86_64 2022-08-17T12:36:48.9404517Z --> Running transaction check 2022-08-17T12:36:48.9405122Z ---> Package libnvidia-container-tools.x86_64 0:1.10.0-1 will be installed 2022-08-17T12:36:48.9416129Z --> Processing Dependency: libnvidia-container1(x86-64) >= 1.10.0-1 for package: libnvidia-container-tools-1.10.0-1.x86_64 2022-08-17T12:36:48.9441499Z --> Processing Dependency: libnvidia-container.so.1(NVC_1.0)(64bit) for package: libnvidia-container-tools-1.10.0-1.x86_64 2022-08-17T12:36:48.9442229Z --> Processing Dependency: libnvidia-container.so.1()(64bit) for package: libnvidia-container-tools-1.10.0-1.x86_64 2022-08-17T12:36:48.9442720Z --> Running transaction check 2022-08-17T12:36:48.9443144Z ---> Package libnvidia-container1.x86_64 0:1.10.0-1 will be installed 2022-08-17T12:36:49.2341184Z --> Finished Dependency Resolution 2022-08-17T12:36:49.2974396Z 2022-08-17T12:36:49.2974597Z Dependencies Resolved 2022-08-17T12:36:49.2986209Z 2022-08-17T12:36:49.2986571Z ================================================================================ 2022-08-17T12:36:49.2986948Z Package Arch Version Repository Size 2022-08-17T12:36:49.2987301Z ================================================================================ 2022-08-17T12:36:49.2990345Z Installing: 2022-08-17T12:36:49.2990924Z nvidia-docker2 noarch 2.11.0-1 libnvidia-container 8.7 k 2022-08-17T12:36:49.2991303Z Installing for dependencies: 2022-08-17T12:36:49.2991777Z libnvidia-container-tools x86_64 1.10.0-1 libnvidia-container 49 k 2022-08-17T12:36:49.2992298Z libnvidia-container1 x86_64 1.10.0-1 libnvidia-container 1.0 M 2022-08-17T12:36:49.2992807Z 
nvidia-container-toolkit x86_64 1.10.0-1 libnvidia-container 3.1 M 2022-08-17T12:36:49.2993060Z 2022-08-17T12:36:49.2993530Z Transaction Summary 2022-08-17T12:36:49.2993808Z ================================================================================ 2022-08-17T12:36:49.2994137Z Install 1 Package (+3 Dependent packages) 2022-08-17T12:36:49.2994341Z 2022-08-17T12:36:49.2994469Z Total download size: 4.1 M 2022-08-17T12:36:49.2996356Z Installed size: 12 M 2022-08-17T12:36:49.2996747Z Downloading packages: 2022-08-17T12:36:49.3967658Z -------------------------------------------------------------------------------- 2022-08-17T12:36:49.3968122Z Total 42 MB/s | 4.1 MB 00:00 2022-08-17T12:36:49.4010147Z Running transaction check 2022-08-17T12:36:49.4072125Z Running transaction test 2022-08-17T12:36:49.4220030Z Transaction test succeeded 2022-08-17T12:36:49.4223369Z Running transaction 2022-08-17T12:36:53.7534001Z Installing : libnvidia-container1-1.10.0-1.x86_64 1/4 2022-08-17T12:36:56.0165113Z Installing : libnvidia-container-tools-1.10.0-1.x86_64 2/4 2022-08-17T12:36:56.0383696Z Installing : nvidia-container-toolkit-1.10.0-1.x86_64 3/4 2022-08-17T12:36:56.0826235Z Installing : nvidia-docker2-2.11.0-1.noarch 4/4 2022-08-17T12:36:56.0920813Z Verifying : libnvidia-container-tools-1.10.0-1.x86_64 1/4 2022-08-17T12:36:56.1092492Z Verifying : libnvidia-container1-1.10.0-1.x86_64 2/4 2022-08-17T12:36:56.1196004Z Verifying : nvidia-container-toolkit-1.10.0-1.x86_64 3/4 2022-08-17T12:36:56.1888453Z Verifying : nvidia-docker2-2.11.0-1.noarch 4/4 2022-08-17T12:36:56.1888710Z 2022-08-17T12:36:56.1888831Z Installed: 2022-08-17T12:36:56.1889260Z nvidia-docker2.noarch 0:2.11.0-1 2022-08-17T12:36:56.1889483Z 2022-08-17T12:36:56.1889595Z Dependency Installed: 2022-08-17T12:36:56.1890031Z libnvidia-container-tools.x86_64 0:1.10.0-1 2022-08-17T12:36:56.1890718Z libnvidia-container1.x86_64 0:1.10.0-1 2022-08-17T12:36:56.1891220Z nvidia-container-toolkit.x86_64 0:1.10.0-1 
2022-08-17T12:36:56.1891431Z 2022-08-17T12:36:56.1891538Z Complete! 2022-08-17T12:36:56.2837504Z + sudo systemctl restart docker 2022-08-17T12:36:56.7559153Z == Installing nvidia driver NVIDIA-Linux-x86_64-515.57.run == 2022-08-17T12:36:56.7561016Z + sudo yum groupinstall -y 'Development Tools' 2022-08-17T12:36:57.2036919Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-08-17T12:36:57.2200894Z Existing lock /var/run/yum.pid: another copy is running as pid 18426. 2022-08-17T12:36:57.2201516Z Another app is currently holding the yum lock; waiting for it to exit... 2022-08-17T12:36:57.2210495Z The other application is: yum 2022-08-17T12:36:57.2210899Z Memory : 88 M RSS (305 MB VSZ) 2022-08-17T12:36:57.2212065Z Started: Wed Aug 17 12:36:55 2022 - 00:02 ago 2022-08-17T12:36:57.2212491Z State : Running, pid: 18426 2022-08-17T12:36:59.2238351Z Another app is currently holding the yum lock; waiting for it to exit... 2022-08-17T12:36:59.2246192Z The other application is: yum 2022-08-17T12:36:59.2246737Z Memory : 158 M RSS (376 MB VSZ) 2022-08-17T12:36:59.2247743Z Started: Wed Aug 17 12:36:55 2022 - 00:04 ago 2022-08-17T12:36:59.2248063Z State : Running, pid: 18426 2022-08-17T12:37:02.2459136Z Resolving Dependencies 2022-08-17T12:37:02.2465496Z --> Running transaction check 2022-08-17T12:37:02.2470046Z ---> Package autoconf.noarch 0:2.69-11.amzn2 will be installed 2022-08-17T12:37:02.2699561Z --> Processing Dependency: m4 >= 1.4.14 for package: autoconf-2.69-11.amzn2.noarch 2022-08-17T12:37:02.3059572Z --> Processing Dependency: perl(Data::Dumper) for package: autoconf-2.69-11.amzn2.noarch 2022-08-17T12:37:02.3064920Z ---> Package automake.noarch 0:1.13.4-3.1.amzn2 will be installed 2022-08-17T12:37:02.3129142Z --> Processing Dependency: perl(Thread::Queue) for package: automake-1.13.4-3.1.amzn2.noarch 2022-08-17T12:37:02.3138552Z --> Processing Dependency: perl(TAP::Parser) for package: automake-1.13.4-3.1.amzn2.noarch 
2022-08-17T12:37:02.3151951Z ---> Package bison.x86_64 0:3.0.4-6.amzn2.0.2 will be installed 2022-08-17T12:37:02.3264890Z ---> Package byacc.x86_64 0:1.9.20130304-3.amzn2.0.2 will be installed 2022-08-17T12:37:02.3276555Z ---> Package cscope.x86_64 0:15.8-10.amzn2.0.2 will be installed 2022-08-17T12:37:02.3328731Z --> Processing Dependency: emacs-filesystem for package: cscope-15.8-10.amzn2.0.2.x86_64 2022-08-17T12:37:02.3356114Z ---> Package ctags.x86_64 0:5.8-13.amzn2.0.2 will be installed 2022-08-17T12:37:02.3370143Z ---> Package diffstat.x86_64 0:1.57-4.amzn2.0.2 will be installed 2022-08-17T12:37:02.3383147Z ---> Package doxygen.x86_64 1:1.8.5-4.amzn2 will be installed 2022-08-17T12:37:02.3486167Z ---> Package elfutils.x86_64 0:0.176-2.amzn2 will be installed 2022-08-17T12:37:02.3652014Z ---> Package flex.x86_64 0:2.5.37-3.amzn2.0.3 will be installed 2022-08-17T12:37:02.3684267Z ---> Package gcc.x86_64 0:7.3.1-15.amzn2 will be installed 2022-08-17T12:37:02.3851937Z --> Processing Dependency: cpp = 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-08-17T12:37:02.3869351Z --> Processing Dependency: libsanitizer >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-08-17T12:37:02.3922232Z --> Processing Dependency: libquadmath >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-08-17T12:37:02.3973010Z --> Processing Dependency: libmpx >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-08-17T12:37:02.4026472Z --> Processing Dependency: libitm >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-08-17T12:37:02.4077763Z --> Processing Dependency: libcilkrts >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-08-17T12:37:02.4130084Z --> Processing Dependency: libatomic >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-08-17T12:37:02.4181996Z --> Processing Dependency: glibc-devel >= 2.2.90-12 for package: gcc-7.3.1-15.amzn2.x86_64 2022-08-17T12:37:02.4333769Z --> Processing Dependency: 
libmpfr.so.4()(64bit) for package: gcc-7.3.1-15.amzn2.x86_64 2022-08-17T12:37:02.4357085Z --> Processing Dependency: libmpc.so.3()(64bit) for package: gcc-7.3.1-15.amzn2.x86_64 2022-08-17T12:37:02.4380324Z ---> Package gcc-c++.x86_64 0:7.3.1-15.amzn2 will be installed 2022-08-17T12:37:02.4419661Z ---> Package gcc-gfortran.x86_64 0:7.3.1-15.amzn2 will be installed 2022-08-17T12:37:02.4464469Z --> Processing Dependency: libgfortran.so.4()(64bit) for package: gcc-gfortran-7.3.1-15.amzn2.x86_64 2022-08-17T12:37:02.4527852Z ---> Package indent.x86_64 0:2.2.11-13.amzn2.0.2 will be installed 2022-08-17T12:37:02.4551475Z ---> Package intltool.noarch 0:0.50.2-7.amzn2 will be installed 2022-08-17T12:37:02.4606196Z --> Processing Dependency: perl(XML::Parser) for package: intltool-0.50.2-7.amzn2.noarch 2022-08-17T12:37:02.4622084Z --> Processing Dependency: gettext-devel for package: intltool-0.50.2-7.amzn2.noarch 2022-08-17T12:37:02.4642727Z ---> Package libtool.x86_64 0:2.4.2-22.2.amzn2.0.2 will be installed 2022-08-17T12:37:02.4680511Z ---> Package patch.x86_64 0:2.7.1-12.amzn2.0.2 will be installed 2022-08-17T12:37:02.4722589Z ---> Package patchutils.x86_64 0:0.3.3-4.amzn2.0.1 will be installed 2022-08-17T12:37:02.4754001Z ---> Package rcs.x86_64 0:5.9.0-5.amzn2.0.2 will be installed 2022-08-17T12:37:02.4797848Z ---> Package rpm-build.x86_64 0:4.11.3-48.amzn2.0.2 will be installed 2022-08-17T12:37:02.5039366Z --> Processing Dependency: /usr/bin/gdb-add-index for package: rpm-build-4.11.3-48.amzn2.0.2.x86_64 2022-08-17T12:37:02.5059069Z ---> Package rpm-sign.x86_64 0:4.11.3-48.amzn2.0.2 will be installed 2022-08-17T12:37:02.5095372Z ---> Package subversion.x86_64 0:1.7.14-16.amzn2.0.1 will be installed 2022-08-17T12:37:02.5259689Z --> Processing Dependency: subversion-libs(x86-64) = 1.7.14-16.amzn2.0.1 for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5280368Z --> Processing Dependency: libsvn_wc-1.so.0()(64bit) for package: 
subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5281516Z --> Processing Dependency: libsvn_subr-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5282145Z --> Processing Dependency: libsvn_repos-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5283025Z --> Processing Dependency: libsvn_ra_svn-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5283655Z --> Processing Dependency: libsvn_ra_neon-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5284628Z --> Processing Dependency: libsvn_ra_local-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5285480Z --> Processing Dependency: libsvn_ra-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5286119Z --> Processing Dependency: libsvn_fs_util-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5286754Z --> Processing Dependency: libsvn_fs_fs-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5287385Z --> Processing Dependency: libsvn_fs_base-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5287991Z --> Processing Dependency: libsvn_fs-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5288615Z --> Processing Dependency: libsvn_diff-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5289244Z --> Processing Dependency: libsvn_delta-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5289875Z --> Processing Dependency: libsvn_client-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5290470Z --> Processing Dependency: libneon.so.27()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5311167Z --> Processing Dependency: 
libaprutil-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5333986Z --> Processing Dependency: libapr-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-08-17T12:37:02.5360072Z ---> Package swig.x86_64 0:3.0.12-11.amzn2.0.3 will be installed 2022-08-17T12:37:02.5388714Z ---> Package system-rpm-config.noarch 0:9.1.0-76.amzn2.0.14 will be installed 2022-08-17T12:37:02.5433447Z --> Processing Dependency: dwz >= 0.4 for package: system-rpm-config-9.1.0-76.amzn2.0.14.noarch 2022-08-17T12:37:02.5451617Z --> Processing Dependency: perl-srpm-macros for package: system-rpm-config-9.1.0-76.amzn2.0.14.noarch 2022-08-17T12:37:02.5465956Z --> Processing Dependency: go-srpm-macros for package: system-rpm-config-9.1.0-76.amzn2.0.14.noarch 2022-08-17T12:37:02.5632534Z ---> Package systemtap.x86_64 0:4.5-1.amzn2.0.1 will be installed 2022-08-17T12:37:02.5648823Z --> Processing Dependency: systemtap-devel = 4.5-1.amzn2.0.1 for package: systemtap-4.5-1.amzn2.0.1.x86_64 2022-08-17T12:37:02.5661650Z --> Processing Dependency: systemtap-client = 4.5-1.amzn2.0.1 for package: systemtap-4.5-1.amzn2.0.1.x86_64 2022-08-17T12:37:02.5673857Z --> Running transaction check 2022-08-17T12:37:02.5677115Z ---> Package apr.x86_64 0:1.7.0-9.amzn2 will be installed 2022-08-17T12:37:02.5763167Z ---> Package apr-util.x86_64 0:1.6.1-5.amzn2.0.2 will be installed 2022-08-17T12:37:02.5812142Z --> Processing Dependency: apr-util-bdb(x86-64) = 1.6.1-5.amzn2.0.2 for package: apr-util-1.6.1-5.amzn2.0.2.x86_64 2022-08-17T12:37:02.5828275Z ---> Package cpp.x86_64 0:7.3.1-15.amzn2 will be installed 2022-08-17T12:37:02.5912416Z ---> Package dwz.x86_64 0:0.11-3.amzn2.0.3 will be installed 2022-08-17T12:37:02.5928529Z ---> Package emacs-filesystem.noarch 1:27.2-4.amzn2.0.1 will be installed 2022-08-17T12:37:02.5929368Z ---> Package gdb.x86_64 0:8.0.1-36.amzn2.0.1 will be installed 2022-08-17T12:37:02.6016551Z ---> Package gettext-devel.x86_64 
0:0.19.8.1-3.amzn2 will be installed 2022-08-17T12:37:02.6081300Z --> Processing Dependency: gettext-common-devel = 0.19.8.1-3.amzn2 for package: gettext-devel-0.19.8.1-3.amzn2.x86_64 2022-08-17T12:37:02.6091037Z ---> Package glibc-devel.x86_64 0:2.26-60.amzn2 will be installed 2022-08-17T12:37:02.6212387Z --> Processing Dependency: glibc-headers = 2.26-60.amzn2 for package: glibc-devel-2.26-60.amzn2.x86_64 2022-08-17T12:37:02.6239754Z --> Processing Dependency: glibc-headers for package: glibc-devel-2.26-60.amzn2.x86_64 2022-08-17T12:37:02.6240413Z ---> Package go-srpm-macros.noarch 0:3.0.15-23.amzn2.0.1 will be installed 2022-08-17T12:37:02.6246168Z ---> Package libatomic.x86_64 0:7.3.1-15.amzn2 will be installed 2022-08-17T12:37:02.6268682Z ---> Package libcilkrts.x86_64 0:7.3.1-15.amzn2 will be installed 2022-08-17T12:37:02.6308946Z ---> Package libgfortran.x86_64 0:7.3.1-15.amzn2 will be installed 2022-08-17T12:37:02.6357838Z ---> Package libitm.x86_64 0:7.3.1-15.amzn2 will be installed 2022-08-17T12:37:02.6383887Z ---> Package libmpc.x86_64 0:1.0.1-3.amzn2.0.2 will be installed 2022-08-17T12:37:02.6403487Z ---> Package libmpx.x86_64 0:7.3.1-15.amzn2 will be installed 2022-08-17T12:37:02.6428432Z ---> Package libquadmath.x86_64 0:7.3.1-15.amzn2 will be installed 2022-08-17T12:37:02.6465171Z ---> Package libsanitizer.x86_64 0:7.3.1-15.amzn2 will be installed 2022-08-17T12:37:02.6528008Z ---> Package m4.x86_64 0:1.4.16-10.amzn2.0.2 will be installed 2022-08-17T12:37:02.6552444Z ---> Package mpfr.x86_64 0:3.1.1-4.amzn2.0.2 will be installed 2022-08-17T12:37:02.6583527Z ---> Package neon.x86_64 0:0.30.0-3.amzn2.0.2 will be installed 2022-08-17T12:37:02.6663297Z --> Processing Dependency: libgnutls.so.28(GNUTLS_2_12)(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-08-17T12:37:02.6701884Z --> Processing Dependency: libgnutls.so.28(GNUTLS_1_4)(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-08-17T12:37:02.6702959Z --> Processing Dependency: 
libproxy.so.1()(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-08-17T12:37:02.6724351Z --> Processing Dependency: libpakchois.so.0()(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-08-17T12:37:02.6744335Z --> Processing Dependency: libgnutls.so.28()(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-08-17T12:37:02.6752688Z ---> Package perl-Data-Dumper.x86_64 0:2.145-3.amzn2.0.2 will be installed 2022-08-17T12:37:02.6809184Z ---> Package perl-Test-Harness.noarch 0:3.28-3.amzn2 will be installed 2022-08-17T12:37:02.6948697Z ---> Package perl-Thread-Queue.noarch 0:3.02-2.amzn2 will be installed 2022-08-17T12:37:02.6964198Z ---> Package perl-XML-Parser.x86_64 0:2.41-10.amzn2.0.2 will be installed 2022-08-17T12:37:02.6987317Z ---> Package perl-srpm-macros.noarch 0:1-8.amzn2.0.1 will be installed 2022-08-17T12:37:02.6988341Z ---> Package subversion-libs.x86_64 0:1.7.14-16.amzn2.0.1 will be installed 2022-08-17T12:37:02.7036894Z ---> Package systemtap-client.x86_64 0:4.5-1.amzn2.0.1 will be installed 2022-08-17T12:37:02.7290397Z --> Processing Dependency: mokutil for package: systemtap-client-4.5-1.amzn2.0.1.x86_64 2022-08-17T12:37:02.7306185Z --> Processing Dependency: libavahi-common.so.3()(64bit) for package: systemtap-client-4.5-1.amzn2.0.1.x86_64 2022-08-17T12:37:02.7333363Z --> Processing Dependency: libavahi-client.so.3()(64bit) for package: systemtap-client-4.5-1.amzn2.0.1.x86_64 2022-08-17T12:37:02.7334105Z ---> Package systemtap-devel.x86_64 0:4.5-1.amzn2.0.1 will be installed 2022-08-17T12:37:02.7459155Z --> Processing Dependency: kernel-devel-uname-r for package: systemtap-devel-4.5-1.amzn2.0.1.x86_64 2022-08-17T12:37:02.8413383Z --> Running transaction check 2022-08-17T12:37:02.8414100Z ---> Package apr-util-bdb.x86_64 0:1.6.1-5.amzn2.0.2 will be installed 2022-08-17T12:37:02.8428417Z ---> Package avahi-libs.x86_64 0:0.6.31-20.amzn2 will be installed 2022-08-17T12:37:02.8462590Z ---> Package gettext-common-devel.noarch 
0:0.19.8.1-3.amzn2 will be installed 2022-08-17T12:37:02.8463734Z ---> Package glibc-headers.x86_64 0:2.26-60.amzn2 will be installed 2022-08-17T12:37:02.8508594Z --> Processing Dependency: kernel-headers >= 2.2.1 for package: glibc-headers-2.26-60.amzn2.x86_64 2022-08-17T12:37:02.9531507Z --> Processing Dependency: kernel-headers for package: glibc-headers-2.26-60.amzn2.x86_64 2022-08-17T12:37:02.9532040Z ---> Package gnutls.x86_64 0:3.3.29-9.amzn2.0.1 will be installed 2022-08-17T12:37:02.9606094Z --> Processing Dependency: trousers >= 0.3.11.2 for package: gnutls-3.3.29-9.amzn2.0.1.x86_64 2022-08-17T12:37:02.9635057Z ---> Package kernel-devel.x86_64 0:4.14.287-215.504.amzn2 will be installed 2022-08-17T12:37:02.9664811Z --> Processing Dependency: elfutils-libelf-devel for package: kernel-devel-4.14.287-215.504.amzn2.x86_64 2022-08-17T12:37:02.9687037Z ---> Package libproxy.x86_64 0:0.4.11-10.amzn2.0.3 will be installed 2022-08-17T12:37:02.9727625Z --> Processing Dependency: libmodman.so.1()(64bit) for package: libproxy-0.4.11-10.amzn2.0.3.x86_64 2022-08-17T12:37:02.9747685Z ---> Package mokutil.x86_64 1:0.3.0-10.amzn2.0.1 will be installed 2022-08-17T12:37:02.9797538Z --> Processing Dependency: libefivar.so.1(libefivar.so.0)(64bit) for package: 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 2022-08-17T12:37:02.9820162Z --> Processing Dependency: libefivar.so.1(LIBEFIVAR_0.24)(64bit) for package: 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 2022-08-17T12:37:02.9821051Z --> Processing Dependency: libefivar.so.1()(64bit) for package: 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 2022-08-17T12:37:02.9821870Z ---> Package pakchois.x86_64 0:0.4-10.amzn2.0.2 will be installed 2022-08-17T12:37:02.9842621Z --> Running transaction check 2022-08-17T12:37:02.9843429Z ---> Package efivar-libs.x86_64 0:31-4.amzn2.0.4 will be installed 2022-08-17T12:37:02.9870240Z ---> Package elfutils-libelf-devel.x86_64 0:0.176-2.amzn2 will be installed 2022-08-17T12:37:02.9883277Z --> Processing Dependency: 
pkgconfig(zlib) for package: elfutils-libelf-devel-0.176-2.amzn2.x86_64 2022-08-17T12:37:02.9908556Z ---> Package kernel-headers.x86_64 0:4.14.287-215.504.amzn2 will be installed 2022-08-17T12:37:02.9909509Z ---> Package libmodman.x86_64 0:2.0.1-8.amzn2.0.2 will be installed 2022-08-17T12:37:02.9937915Z ---> Package trousers.x86_64 0:0.3.14-2.amzn2.0.2 will be installed 2022-08-17T12:37:03.0009812Z --> Running transaction check 2022-08-17T12:37:03.0010298Z ---> Package zlib-devel.x86_64 0:1.2.7-19.amzn2.0.1 will be installed 2022-08-17T12:37:03.2564057Z --> Finished Dependency Resolution 2022-08-17T12:37:03.3799015Z 2022-08-17T12:37:03.3799639Z Dependencies Resolved 2022-08-17T12:37:03.3912966Z 2022-08-17T12:37:03.3913493Z ================================================================================ 2022-08-17T12:37:03.3913968Z Package Arch Version Repository Size 2022-08-17T12:37:03.3914646Z ================================================================================ 2022-08-17T12:37:03.3915379Z Installing for group install "Development Tools": 2022-08-17T12:37:03.3916225Z autoconf noarch 2.69-11.amzn2 amzn2-core 701 k 2022-08-17T12:37:03.3916664Z automake noarch 1.13.4-3.1.amzn2 amzn2-core 679 k 2022-08-17T12:37:03.3917114Z bison x86_64 3.0.4-6.amzn2.0.2 amzn2-core 674 k 2022-08-17T12:37:03.3917547Z byacc x86_64 1.9.20130304-3.amzn2.0.2 amzn2-core 66 k 2022-08-17T12:37:03.3917965Z cscope x86_64 15.8-10.amzn2.0.2 amzn2-core 204 k 2022-08-17T12:37:03.3918395Z ctags x86_64 5.8-13.amzn2.0.2 amzn2-core 157 k 2022-08-17T12:37:03.3918827Z diffstat x86_64 1.57-4.amzn2.0.2 amzn2-core 35 k 2022-08-17T12:37:03.3919269Z doxygen x86_64 1:1.8.5-4.amzn2 amzn2-core 3.5 M 2022-08-17T12:37:03.3919951Z elfutils x86_64 0.176-2.amzn2 amzn2-core 307 k 2022-08-17T12:37:03.3920813Z flex x86_64 2.5.37-3.amzn2.0.3 amzn2-core 291 k 2022-08-17T12:37:03.3921667Z gcc x86_64 7.3.1-15.amzn2 amzn2-core 22 M 2022-08-17T12:37:03.3922503Z gcc-c++ x86_64 7.3.1-15.amzn2 amzn2-core 13 M 
2022-08-17T12:37:03.3922968Z gcc-gfortran x86_64 7.3.1-15.amzn2 amzn2-core 11 M 2022-08-17T12:37:03.3923408Z indent x86_64 2.2.11-13.amzn2.0.2 amzn2-core 150 k 2022-08-17T12:37:03.3923869Z intltool noarch 0.50.2-7.amzn2 amzn2-core 59 k 2022-08-17T12:37:03.3924648Z libtool x86_64 2.4.2-22.2.amzn2.0.2 amzn2-core 588 k 2022-08-17T12:37:03.3925157Z patch x86_64 2.7.1-12.amzn2.0.2 amzn2-core 110 k 2022-08-17T12:37:03.3925589Z patchutils x86_64 0.3.3-4.amzn2.0.1 amzn2-core 104 k 2022-08-17T12:37:03.3926191Z rcs x86_64 5.9.0-5.amzn2.0.2 amzn2-core 231 k 2022-08-17T12:37:03.3926627Z rpm-build x86_64 4.11.3-48.amzn2.0.2 amzn2-core 150 k 2022-08-17T12:37:03.3927115Z rpm-sign x86_64 4.11.3-48.amzn2.0.2 amzn2-core 50 k 2022-08-17T12:37:03.3927557Z subversion x86_64 1.7.14-16.amzn2.0.1 amzn2-core 1.0 M 2022-08-17T12:37:03.3927971Z swig x86_64 3.0.12-11.amzn2.0.3 amzn2-core 1.4 M 2022-08-17T12:37:03.3928421Z system-rpm-config noarch 9.1.0-76.amzn2.0.14 amzn2-core 90 k 2022-08-17T12:37:03.3928875Z systemtap x86_64 4.5-1.amzn2.0.1 amzn2-core 12 k 2022-08-17T12:37:03.3929205Z Installing for dependencies: 2022-08-17T12:37:03.3929600Z apr x86_64 1.7.0-9.amzn2 amzn2-core 122 k 2022-08-17T12:37:03.3930033Z apr-util x86_64 1.6.1-5.amzn2.0.2 amzn2-core 99 k 2022-08-17T12:37:03.3930481Z apr-util-bdb x86_64 1.6.1-5.amzn2.0.2 amzn2-core 19 k 2022-08-17T12:37:03.3930907Z avahi-libs x86_64 0.6.31-20.amzn2 amzn2-core 61 k 2022-08-17T12:37:03.3931344Z cpp x86_64 7.3.1-15.amzn2 amzn2-core 9.2 M 2022-08-17T12:37:03.3931851Z dwz x86_64 0.11-3.amzn2.0.3 amzn2-core 98 k 2022-08-17T12:37:03.3932302Z efivar-libs x86_64 31-4.amzn2.0.4 amzn2-core 68 k 2022-08-17T12:37:03.3932745Z elfutils-libelf-devel x86_64 0.176-2.amzn2 amzn2-core 40 k 2022-08-17T12:37:03.3933215Z emacs-filesystem noarch 1:27.2-4.amzn2.0.1 amzn2-core 67 k 2022-08-17T12:37:03.3933658Z gdb x86_64 8.0.1-36.amzn2.0.1 amzn2-core 3.1 M 2022-08-17T12:37:03.3934096Z gettext-common-devel noarch 0.19.8.1-3.amzn2 amzn2-core 410 k 
2022-08-17T12:37:03.3934569Z gettext-devel x86_64 0.19.8.1-3.amzn2 amzn2-core 320 k 2022-08-17T12:37:03.3935013Z glibc-devel x86_64 2.26-60.amzn2 amzn2-core 994 k 2022-08-17T12:37:03.3935455Z glibc-headers x86_64 2.26-60.amzn2 amzn2-core 515 k 2022-08-17T12:37:03.3935879Z gnutls x86_64 3.3.29-9.amzn2.0.1 amzn2-core 661 k 2022-08-17T12:37:03.3936323Z go-srpm-macros noarch 3.0.15-23.amzn2.0.1 amzn2-core 23 k 2022-08-17T12:37:03.3936783Z kernel-devel x86_64 4.14.287-215.504.amzn2 amzn2-core 13 M 2022-08-17T12:37:03.3937213Z kernel-headers x86_64 4.14.287-215.504.amzn2 amzn2-core 1.2 M 2022-08-17T12:37:03.3937653Z libatomic x86_64 7.3.1-15.amzn2 amzn2-core 46 k 2022-08-17T12:37:03.3938080Z libcilkrts x86_64 7.3.1-15.amzn2 amzn2-core 85 k 2022-08-17T12:37:03.3938513Z libgfortran x86_64 7.3.1-15.amzn2 amzn2-core 536 k 2022-08-17T12:37:03.3938927Z libitm x86_64 7.3.1-15.amzn2 amzn2-core 85 k 2022-08-17T12:37:03.3939355Z libmodman x86_64 2.0.1-8.amzn2.0.2 amzn2-core 29 k 2022-08-17T12:37:03.3939793Z libmpc x86_64 1.0.1-3.amzn2.0.2 amzn2-core 52 k 2022-08-17T12:37:03.3940203Z libmpx x86_64 7.3.1-15.amzn2 amzn2-core 51 k 2022-08-17T12:37:03.3940631Z libproxy x86_64 0.4.11-10.amzn2.0.3 amzn2-core 61 k 2022-08-17T12:37:03.3941068Z libquadmath x86_64 7.3.1-15.amzn2 amzn2-core 189 k 2022-08-17T12:37:03.3941503Z libsanitizer x86_64 7.3.1-15.amzn2 amzn2-core 642 k 2022-08-17T12:37:03.3941919Z m4 x86_64 1.4.16-10.amzn2.0.2 amzn2-core 256 k 2022-08-17T12:37:03.3942338Z mokutil x86_64 1:0.3.0-10.amzn2.0.1 amzn2-core 39 k 2022-08-17T12:37:03.3942827Z mpfr x86_64 3.1.1-4.amzn2.0.2 amzn2-core 208 k 2022-08-17T12:37:03.3943230Z neon x86_64 0.30.0-3.amzn2.0.2 amzn2-core 166 k 2022-08-17T12:37:03.3944274Z pakchois x86_64 0.4-10.amzn2.0.2 amzn2-core 14 k 2022-08-17T12:37:03.3944737Z perl-Data-Dumper x86_64 2.145-3.amzn2.0.2 amzn2-core 48 k 2022-08-17T12:37:03.3945204Z perl-Test-Harness noarch 3.28-3.amzn2 amzn2-core 302 k 2022-08-17T12:37:03.3945656Z perl-Thread-Queue noarch 3.02-2.amzn2 
amzn2-core 17 k 2022-08-17T12:37:03.3946133Z perl-XML-Parser x86_64 2.41-10.amzn2.0.2 amzn2-core 223 k 2022-08-17T12:37:03.3946604Z perl-srpm-macros noarch 1-8.amzn2.0.1 amzn2-core 4.7 k 2022-08-17T12:37:03.3947060Z subversion-libs x86_64 1.7.14-16.amzn2.0.1 amzn2-core 912 k 2022-08-17T12:37:03.3947515Z systemtap-client x86_64 4.5-1.amzn2.0.1 amzn2-core 3.7 M 2022-08-17T12:37:03.3947977Z systemtap-devel x86_64 4.5-1.amzn2.0.1 amzn2-core 2.3 M 2022-08-17T12:37:03.3948424Z trousers x86_64 0.3.14-2.amzn2.0.2 amzn2-core 294 k 2022-08-17T12:37:03.3948939Z zlib-devel x86_64 1.2.7-19.amzn2.0.1 amzn2-core 50 k 2022-08-17T12:37:03.3949160Z 2022-08-17T12:37:03.3949275Z Transaction Summary 2022-08-17T12:37:03.3949564Z ================================================================================ 2022-08-17T12:37:03.3949863Z Install 25 Packages (+43 Dependent packages) 2022-08-17T12:37:03.3950062Z 2022-08-17T12:37:03.3950194Z Total download size: 96 M 2022-08-17T12:37:03.3950455Z Installed size: 303 M 2022-08-17T12:37:03.3950699Z Downloading packages: 2022-08-17T12:37:03.3965892Z Delta RPMs disabled because /usr/bin/applydeltarpm not installed. 
2022-08-17T12:37:04.7881918Z -------------------------------------------------------------------------------- 2022-08-17T12:37:04.7882387Z Total 69 MB/s | 96 MB 00:01 2022-08-17T12:37:04.8939878Z Running transaction check 2022-08-17T12:37:04.9724308Z Running transaction test 2022-08-17T12:37:05.4123821Z Transaction test succeeded 2022-08-17T12:37:05.4127524Z Running transaction 2022-08-17T12:37:05.9374979Z Installing : mpfr-3.1.1-4.amzn2.0.2.x86_64 1/68 2022-08-17T12:37:05.9861505Z Installing : libmpc-1.0.1-3.amzn2.0.2.x86_64 2/68 2022-08-17T12:37:06.0258779Z Installing : m4-1.4.16-10.amzn2.0.2.x86_64 3/68 2022-08-17T12:37:06.0526373Z Installing : apr-1.7.0-9.amzn2.x86_64 4/68 2022-08-17T12:37:06.0799169Z Installing : apr-util-bdb-1.6.1-5.amzn2.0.2.x86_64 5/68 2022-08-17T12:37:06.1106325Z Installing : apr-util-1.6.1-5.amzn2.0.2.x86_64 6/68 2022-08-17T12:37:06.1544017Z Installing : avahi-libs-0.6.31-20.amzn2.x86_64 7/68 2022-08-17T12:37:06.1938143Z Installing : libquadmath-7.3.1-15.amzn2.x86_64 8/68 2022-08-17T12:37:06.2168243Z Installing : patch-2.7.1-12.amzn2.0.2.x86_64 9/68 2022-08-17T12:37:06.2984831Z Installing : perl-Thread-Queue-3.02-2.amzn2.noarch 10/68 2022-08-17T12:37:07.3574435Z Installing : libgfortran-7.3.1-15.amzn2.x86_64 11/68 2022-08-17T12:37:07.3969919Z Installing : cpp-7.3.1-15.amzn2.x86_64 12/68 2022-08-17T12:37:07.4167605Z Installing : zlib-devel-1.2.7-19.amzn2.0.1.x86_64 13/68 2022-08-17T12:37:07.4377373Z Installing : elfutils-libelf-devel-0.176-2.amzn2.x86_64 14/68 2022-08-17T12:37:07.4722887Z Installing : libmodman-2.0.1-8.amzn2.0.2.x86_64 15/68 2022-08-17T12:37:07.5323559Z Installing : libproxy-0.4.11-10.amzn2.0.3.x86_64 16/68 2022-08-17T12:37:07.5882328Z Installing : perl-XML-Parser-2.41-10.amzn2.0.2.x86_64 17/68 2022-08-17T12:37:07.6998669Z Installing : elfutils-0.176-2.amzn2.x86_64 18/68 2022-08-17T12:37:07.7302666Z Installing : libsanitizer-7.3.1-15.amzn2.x86_64 19/68 2022-08-17T12:37:07.7530402Z Installing : 
1:emacs-filesystem-27.2-4.amzn2.0.1.noarch 20/68 2022-08-17T12:37:07.7875254Z Installing : efivar-libs-31-4.amzn2.0.4.x86_64 21/68 2022-08-17T12:37:07.8202755Z Installing : 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 22/68 2022-08-17T12:37:07.9033329Z Installing : dwz-0.11-3.amzn2.0.3.x86_64 23/68 2022-08-17T12:37:08.0784703Z Installing : trousers-0.3.14-2.amzn2.0.2.x86_64 24/68 2022-08-17T12:37:08.1139973Z Installing : gnutls-3.3.29-9.amzn2.0.1.x86_64 25/68 2022-08-17T12:37:08.5294453Z Installing : libitm-7.3.1-15.amzn2.x86_64 26/68 2022-08-17T12:37:08.8198347Z Installing : gdb-8.0.1-36.amzn2.0.1.x86_64 27/68 2022-08-17T12:37:08.9921223Z Installing : kernel-headers-4.14.287-215.504.amzn2.x86_64 28/68 2022-08-17T12:37:09.1290747Z Installing : glibc-headers-2.26-60.amzn2.x86_64 29/68 2022-08-17T12:37:09.1716125Z Installing : glibc-devel-2.26-60.amzn2.x86_64 30/68 2022-08-17T12:37:09.2004135Z Installing : libmpx-7.3.1-15.amzn2.x86_64 31/68 2022-08-17T12:37:09.2303543Z Installing : perl-srpm-macros-1-8.amzn2.0.1.noarch 32/68 2022-08-17T12:37:09.2558076Z Installing : system-rpm-config-9.1.0-76.amzn2.0.14.noarch 33/68 2022-08-17T12:37:09.2783971Z Installing : go-srpm-macros-3.0.15-23.amzn2.0.1.noarch 34/68 2022-08-17T12:37:09.3732217Z Installing : perl-Data-Dumper-2.145-3.amzn2.0.2.x86_64 35/68 2022-08-17T12:37:09.4208222Z Installing : autoconf-2.69-11.amzn2.noarch 36/68 2022-08-17T12:37:09.5106412Z Installing : gettext-common-devel-0.19.8.1-3.amzn2.noarch 37/68 2022-08-17T12:37:09.5952783Z Installing : gettext-devel-0.19.8.1-3.amzn2.x86_64 38/68 2022-08-17T12:37:09.6980212Z Installing : perl-Test-Harness-3.28-3.amzn2.noarch 39/68 2022-08-17T12:37:09.7362975Z Installing : automake-1.13.4-3.1.amzn2.noarch 40/68 2022-08-17T12:37:09.7675518Z Installing : libatomic-7.3.1-15.amzn2.x86_64 41/68 2022-08-17T12:37:11.7966120Z Installing : libcilkrts-7.3.1-15.amzn2.x86_64 42/68 2022-08-17T12:37:15.5437016Z Installing : gcc-7.3.1-15.amzn2.x86_64 43/68 2022-08-17T12:37:26.6389580Z 
Installing : kernel-devel-4.14.287-215.504.amzn2.x86_64 44/68 2022-08-17T12:37:27.2481629Z Installing : systemtap-devel-4.5-1.amzn2.0.1.x86_64 45/68 2022-08-17T12:37:27.3194822Z Installing : systemtap-client-4.5-1.amzn2.0.1.x86_64 46/68 2022-08-17T12:37:27.3808958Z Installing : pakchois-0.4-10.amzn2.0.2.x86_64 47/68 2022-08-17T12:37:27.5169870Z Installing : neon-0.30.0-3.amzn2.0.2.x86_64 48/68 2022-08-17T12:37:27.6951840Z Installing : subversion-libs-1.7.14-16.amzn2.0.1.x86_64 49/68 2022-08-17T12:37:27.7968331Z Installing : subversion-1.7.14-16.amzn2.0.1.x86_64 50/68 2022-08-17T12:37:29.0132575Z Installing : systemtap-4.5-1.amzn2.0.1.x86_64 51/68 2022-08-17T12:37:30.6300995Z Installing : gcc-gfortran-7.3.1-15.amzn2.x86_64 52/68 2022-08-17T12:37:30.7582311Z Installing : gcc-c++-7.3.1-15.amzn2.x86_64 53/68 2022-08-17T12:37:30.7969175Z Installing : libtool-2.4.2-22.2.amzn2.0.2.x86_64 54/68 2022-08-17T12:37:30.8376991Z Installing : intltool-0.50.2-7.amzn2.noarch 55/68 2022-08-17T12:37:30.8936068Z Installing : rpm-build-4.11.3-48.amzn2.0.2.x86_64 56/68 2022-08-17T12:37:30.9540091Z Installing : cscope-15.8-10.amzn2.0.2.x86_64 57/68 2022-08-17T12:37:31.0602811Z Installing : flex-2.5.37-3.amzn2.0.3.x86_64 58/68 2022-08-17T12:37:31.1244392Z Installing : bison-3.0.4-6.amzn2.0.2.x86_64 59/68 2022-08-17T12:37:31.1726406Z Installing : rcs-5.9.0-5.amzn2.0.2.x86_64 60/68 2022-08-17T12:37:31.2095008Z Installing : ctags-5.8-13.amzn2.0.2.x86_64 61/68 2022-08-17T12:37:31.2542573Z Installing : indent-2.2.11-13.amzn2.0.2.x86_64 62/68 2022-08-17T12:37:31.9226331Z Installing : patchutils-0.3.3-4.amzn2.0.1.x86_64 63/68 2022-08-17T12:37:31.9648592Z Installing : 1:doxygen-1.8.5-4.amzn2.x86_64 64/68 2022-08-17T12:37:31.9901073Z Installing : diffstat-1.57-4.amzn2.0.2.x86_64 65/68 2022-08-17T12:37:32.3035114Z Installing : byacc-1.9.20130304-3.amzn2.0.2.x86_64 66/68 2022-08-17T12:37:32.3453364Z Installing : swig-3.0.12-11.amzn2.0.3.x86_64 67/68 2022-08-17T12:37:32.4111786Z Installing : 
rpm-sign-4.11.3-48.amzn2.0.2.x86_64 68/68 2022-08-17T12:37:32.4241715Z Verifying : elfutils-libelf-devel-0.176-2.amzn2.x86_64 1/68 2022-08-17T12:37:32.4331031Z Verifying : perl-Thread-Queue-3.02-2.amzn2.noarch 2/68 2022-08-17T12:37:32.4417847Z Verifying : gettext-devel-0.19.8.1-3.amzn2.x86_64 3/68 2022-08-17T12:37:32.4510258Z Verifying : patch-2.7.1-12.amzn2.0.2.x86_64 4/68 2022-08-17T12:37:32.4616693Z Verifying : flex-2.5.37-3.amzn2.0.3.x86_64 5/68 2022-08-17T12:37:32.4712777Z Verifying : glibc-headers-2.26-60.amzn2.x86_64 6/68 2022-08-17T12:37:32.4805336Z Verifying : pakchois-0.4-10.amzn2.0.2.x86_64 7/68 2022-08-17T12:37:32.4895787Z Verifying : rpm-sign-4.11.3-48.amzn2.0.2.x86_64 8/68 2022-08-17T12:37:32.4985780Z Verifying : gcc-gfortran-7.3.1-15.amzn2.x86_64 9/68 2022-08-17T12:37:32.5126036Z Verifying : kernel-devel-4.14.287-215.504.amzn2.x86_64 10/68 2022-08-17T12:37:32.5213960Z Verifying : swig-3.0.12-11.amzn2.0.3.x86_64 11/68 2022-08-17T12:37:32.5300769Z Verifying : byacc-1.9.20130304-3.amzn2.0.2.x86_64 12/68 2022-08-17T12:37:32.5388875Z Verifying : libmpc-1.0.1-3.amzn2.0.2.x86_64 13/68 2022-08-17T12:37:32.5486757Z Verifying : libcilkrts-7.3.1-15.amzn2.x86_64 14/68 2022-08-17T12:37:32.5578704Z Verifying : go-srpm-macros-3.0.15-23.amzn2.0.1.noarch 15/68 2022-08-17T12:37:32.5664914Z Verifying : libproxy-0.4.11-10.amzn2.0.3.x86_64 16/68 2022-08-17T12:37:32.5761730Z Verifying : cscope-15.8-10.amzn2.0.2.x86_64 17/68 2022-08-17T12:37:32.5853942Z Verifying : diffstat-1.57-4.amzn2.0.2.x86_64 18/68 2022-08-17T12:37:32.5944296Z Verifying : 1:doxygen-1.8.5-4.amzn2.x86_64 19/68 2022-08-17T12:37:32.6038807Z Verifying : gcc-c++-7.3.1-15.amzn2.x86_64 20/68 2022-08-17T12:37:32.6129299Z Verifying : libatomic-7.3.1-15.amzn2.x86_64 21/68 2022-08-17T12:37:32.6211531Z Verifying : system-rpm-config-9.1.0-76.amzn2.0.14.noarch 22/68 2022-08-17T12:37:32.6390752Z Verifying : systemtap-devel-4.5-1.amzn2.0.1.x86_64 23/68 2022-08-17T12:37:32.6486299Z Verifying : 
perl-Test-Harness-3.28-3.amzn2.noarch 24/68 2022-08-17T12:37:32.6573413Z Verifying : autoconf-2.69-11.amzn2.noarch 25/68 2022-08-17T12:37:32.6656351Z Verifying : libquadmath-7.3.1-15.amzn2.x86_64 26/68 2022-08-17T12:37:32.6744493Z Verifying : intltool-0.50.2-7.amzn2.noarch 27/68 2022-08-17T12:37:32.6834758Z Verifying : apr-util-1.6.1-5.amzn2.0.2.x86_64 28/68 2022-08-17T12:37:32.6947854Z Verifying : glibc-devel-2.26-60.amzn2.x86_64 29/68 2022-08-17T12:37:32.7120340Z Verifying : cpp-7.3.1-15.amzn2.x86_64 30/68 2022-08-17T12:37:32.7205742Z Verifying : rpm-build-4.11.3-48.amzn2.0.2.x86_64 31/68 2022-08-17T12:37:32.7291407Z Verifying : gettext-common-devel-0.19.8.1-3.amzn2.noarch 32/68 2022-08-17T12:37:32.7373791Z Verifying : perl-Data-Dumper-2.145-3.amzn2.0.2.x86_64 33/68 2022-08-17T12:37:32.7456419Z Verifying : perl-srpm-macros-1-8.amzn2.0.1.noarch 34/68 2022-08-17T12:37:32.7603118Z Verifying : gnutls-3.3.29-9.amzn2.0.1.x86_64 35/68 2022-08-17T12:37:32.7690749Z Verifying : subversion-libs-1.7.14-16.amzn2.0.1.x86_64 36/68 2022-08-17T12:37:32.7784495Z Verifying : automake-1.13.4-3.1.amzn2.noarch 37/68 2022-08-17T12:37:32.7935791Z Verifying : apr-util-bdb-1.6.1-5.amzn2.0.2.x86_64 38/68 2022-08-17T12:37:32.8034214Z Verifying : libmpx-7.3.1-15.amzn2.x86_64 39/68 2022-08-17T12:37:32.8125030Z Verifying : avahi-libs-0.6.31-20.amzn2.x86_64 40/68 2022-08-17T12:37:32.8215940Z Verifying : kernel-headers-4.14.287-215.504.amzn2.x86_64 41/68 2022-08-17T12:37:32.8309960Z Verifying : bison-3.0.4-6.amzn2.0.2.x86_64 42/68 2022-08-17T12:37:32.8424640Z Verifying : libgfortran-7.3.1-15.amzn2.x86_64 43/68 2022-08-17T12:37:32.8622746Z Verifying : gdb-8.0.1-36.amzn2.0.1.x86_64 44/68 2022-08-17T12:37:32.8727735Z Verifying : patchutils-0.3.3-4.amzn2.0.1.x86_64 45/68 2022-08-17T12:37:32.8820895Z Verifying : libitm-7.3.1-15.amzn2.x86_64 46/68 2022-08-17T12:37:32.8910633Z Verifying : libtool-2.4.2-22.2.amzn2.0.2.x86_64 47/68 2022-08-17T12:37:32.9009444Z Verifying : gcc-7.3.1-15.amzn2.x86_64 48/68 
2022-08-17T12:37:32.9108041Z Verifying : indent-2.2.11-13.amzn2.0.2.x86_64 49/68 2022-08-17T12:37:32.9215094Z Verifying : subversion-1.7.14-16.amzn2.0.1.x86_64 50/68 2022-08-17T12:37:32.9311227Z Verifying : apr-1.7.0-9.amzn2.x86_64 51/68 2022-08-17T12:37:32.9418305Z Verifying : ctags-5.8-13.amzn2.0.2.x86_64 52/68 2022-08-17T12:37:32.9549280Z Verifying : 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 53/68 2022-08-17T12:37:32.9650672Z Verifying : mpfr-3.1.1-4.amzn2.0.2.x86_64 54/68 2022-08-17T12:37:32.9738875Z Verifying : trousers-0.3.14-2.amzn2.0.2.x86_64 55/68 2022-08-17T12:37:32.9828284Z Verifying : neon-0.30.0-3.amzn2.0.2.x86_64 56/68 2022-08-17T12:37:32.9928203Z Verifying : systemtap-4.5-1.amzn2.0.1.x86_64 57/68 2022-08-17T12:37:33.0015177Z Verifying : dwz-0.11-3.amzn2.0.3.x86_64 58/68 2022-08-17T12:37:33.0105889Z Verifying : systemtap-client-4.5-1.amzn2.0.1.x86_64 59/68 2022-08-17T12:37:33.0191452Z Verifying : efivar-libs-31-4.amzn2.0.4.x86_64 60/68 2022-08-17T12:37:33.0281733Z Verifying : rcs-5.9.0-5.amzn2.0.2.x86_64 61/68 2022-08-17T12:37:33.0375200Z Verifying : 1:emacs-filesystem-27.2-4.amzn2.0.1.noarch 62/68 2022-08-17T12:37:33.0452854Z Verifying : libsanitizer-7.3.1-15.amzn2.x86_64 63/68 2022-08-17T12:37:33.0542889Z Verifying : elfutils-0.176-2.amzn2.x86_64 64/68 2022-08-17T12:37:33.0648353Z Verifying : m4-1.4.16-10.amzn2.0.2.x86_64 65/68 2022-08-17T12:37:33.0752908Z Verifying : perl-XML-Parser-2.41-10.amzn2.0.2.x86_64 66/68 2022-08-17T12:37:33.0866592Z Verifying : libmodman-2.0.1-8.amzn2.0.2.x86_64 67/68 2022-08-17T12:37:33.1651366Z Verifying : zlib-devel-1.2.7-19.amzn2.0.1.x86_64 68/68 2022-08-17T12:37:33.1651621Z 2022-08-17T12:37:33.1654804Z Installed: 2022-08-17T12:37:33.1655487Z autoconf.noarch 0:2.69-11.amzn2 2022-08-17T12:37:33.1655960Z automake.noarch 0:1.13.4-3.1.amzn2 2022-08-17T12:37:33.1656403Z bison.x86_64 0:3.0.4-6.amzn2.0.2 2022-08-17T12:37:33.1657686Z byacc.x86_64 0:1.9.20130304-3.amzn2.0.2 2022-08-17T12:37:33.1658222Z cscope.x86_64 
0:15.8-10.amzn2.0.2 2022-08-17T12:37:33.1658633Z ctags.x86_64 0:5.8-13.amzn2.0.2 2022-08-17T12:37:33.1659061Z diffstat.x86_64 0:1.57-4.amzn2.0.2 2022-08-17T12:37:33.1659489Z doxygen.x86_64 1:1.8.5-4.amzn2 2022-08-17T12:37:33.1659901Z elfutils.x86_64 0:0.176-2.amzn2 2022-08-17T12:37:33.1660341Z flex.x86_64 0:2.5.37-3.amzn2.0.3 2022-08-17T12:37:33.1662051Z gcc.x86_64 0:7.3.1-15.amzn2 2022-08-17T12:37:33.1662865Z gcc-c++.x86_64 0:7.3.1-15.amzn2 2022-08-17T12:37:33.1664175Z gcc-gfortran.x86_64 0:7.3.1-15.amzn2 2022-08-17T12:37:33.1664746Z indent.x86_64 0:2.2.11-13.amzn2.0.2 2022-08-17T12:37:33.1665160Z intltool.noarch 0:0.50.2-7.amzn2 2022-08-17T12:37:33.1665594Z libtool.x86_64 0:2.4.2-22.2.amzn2.0.2 2022-08-17T12:37:33.1666020Z patch.x86_64 0:2.7.1-12.amzn2.0.2 2022-08-17T12:37:33.1666452Z patchutils.x86_64 0:0.3.3-4.amzn2.0.1 2022-08-17T12:37:33.1666859Z rcs.x86_64 0:5.9.0-5.amzn2.0.2 2022-08-17T12:37:33.1667609Z rpm-build.x86_64 0:4.11.3-48.amzn2.0.2 2022-08-17T12:37:33.1668206Z rpm-sign.x86_64 0:4.11.3-48.amzn2.0.2 2022-08-17T12:37:33.1668624Z subversion.x86_64 0:1.7.14-16.amzn2.0.1 2022-08-17T12:37:33.1669052Z swig.x86_64 0:3.0.12-11.amzn2.0.3 2022-08-17T12:37:33.1669493Z system-rpm-config.noarch 0:9.1.0-76.amzn2.0.14 2022-08-17T12:37:33.1669948Z systemtap.x86_64 0:4.5-1.amzn2.0.1 2022-08-17T12:37:33.1670134Z 2022-08-17T12:37:33.1670260Z Dependency Installed: 2022-08-17T12:37:33.1670682Z apr.x86_64 0:1.7.0-9.amzn2 2022-08-17T12:37:33.1671112Z apr-util.x86_64 0:1.6.1-5.amzn2.0.2 2022-08-17T12:37:33.1671754Z apr-util-bdb.x86_64 0:1.6.1-5.amzn2.0.2 2022-08-17T12:37:33.1672175Z avahi-libs.x86_64 0:0.6.31-20.amzn2 2022-08-17T12:37:33.1672597Z cpp.x86_64 0:7.3.1-15.amzn2 2022-08-17T12:37:33.1673012Z dwz.x86_64 0:0.11-3.amzn2.0.3 2022-08-17T12:37:33.1673414Z efivar-libs.x86_64 0:31-4.amzn2.0.4 2022-08-17T12:37:33.1673869Z elfutils-libelf-devel.x86_64 0:0.176-2.amzn2 2022-08-17T12:37:33.1674332Z emacs-filesystem.noarch 1:27.2-4.amzn2.0.1 2022-08-17T12:37:33.1674772Z 
gdb.x86_64 0:8.0.1-36.amzn2.0.1 2022-08-17T12:37:33.1675196Z gettext-common-devel.noarch 0:0.19.8.1-3.amzn2 2022-08-17T12:37:33.1675655Z gettext-devel.x86_64 0:0.19.8.1-3.amzn2 2022-08-17T12:37:33.1676087Z glibc-devel.x86_64 0:2.26-60.amzn2 2022-08-17T12:37:33.1676494Z glibc-headers.x86_64 0:2.26-60.amzn2 2022-08-17T12:37:33.1677004Z gnutls.x86_64 0:3.3.29-9.amzn2.0.1 2022-08-17T12:37:33.1677454Z go-srpm-macros.noarch 0:3.0.15-23.amzn2.0.1 2022-08-17T12:37:33.1677903Z kernel-devel.x86_64 0:4.14.287-215.504.amzn2 2022-08-17T12:37:33.1678357Z kernel-headers.x86_64 0:4.14.287-215.504.amzn2 2022-08-17T12:37:33.1678792Z libatomic.x86_64 0:7.3.1-15.amzn2 2022-08-17T12:37:33.1679215Z libcilkrts.x86_64 0:7.3.1-15.amzn2 2022-08-17T12:37:33.1679623Z libgfortran.x86_64 0:7.3.1-15.amzn2 2022-08-17T12:37:33.1680050Z libitm.x86_64 0:7.3.1-15.amzn2 2022-08-17T12:37:33.1680465Z libmodman.x86_64 0:2.0.1-8.amzn2.0.2 2022-08-17T12:37:33.1680885Z libmpc.x86_64 0:1.0.1-3.amzn2.0.2 2022-08-17T12:37:33.1681285Z libmpx.x86_64 0:7.3.1-15.amzn2 2022-08-17T12:37:33.1681700Z libproxy.x86_64 0:0.4.11-10.amzn2.0.3 2022-08-17T12:37:33.1682122Z libquadmath.x86_64 0:7.3.1-15.amzn2 2022-08-17T12:37:33.1682529Z libsanitizer.x86_64 0:7.3.1-15.amzn2 2022-08-17T12:37:33.1682946Z m4.x86_64 0:1.4.16-10.amzn2.0.2 2022-08-17T12:37:33.1683353Z mokutil.x86_64 1:0.3.0-10.amzn2.0.1 2022-08-17T12:37:33.1683771Z mpfr.x86_64 0:3.1.1-4.amzn2.0.2 2022-08-17T12:37:33.1684161Z neon.x86_64 0:0.30.0-3.amzn2.0.2 2022-08-17T12:37:33.1684576Z pakchois.x86_64 0:0.4-10.amzn2.0.2 2022-08-17T12:37:33.1685026Z perl-Data-Dumper.x86_64 0:2.145-3.amzn2.0.2 2022-08-17T12:37:33.1685473Z perl-Test-Harness.noarch 0:3.28-3.amzn2 2022-08-17T12:37:33.1685937Z perl-Thread-Queue.noarch 0:3.02-2.amzn2 2022-08-17T12:37:33.1686404Z perl-XML-Parser.x86_64 0:2.41-10.amzn2.0.2 2022-08-17T12:37:33.1686870Z perl-srpm-macros.noarch 0:1-8.amzn2.0.1 2022-08-17T12:37:33.1687310Z subversion-libs.x86_64 0:1.7.14-16.amzn2.0.1 
2022-08-17T12:37:33.1687817Z systemtap-client.x86_64 0:4.5-1.amzn2.0.1 2022-08-17T12:37:33.1688267Z systemtap-devel.x86_64 0:4.5-1.amzn2.0.1 2022-08-17T12:37:33.1688680Z trousers.x86_64 0:0.3.14-2.amzn2.0.2 2022-08-17T12:37:33.1689112Z zlib-devel.x86_64 0:1.2.7-19.amzn2.0.1 2022-08-17T12:37:33.1689319Z 2022-08-17T12:37:33.1689426Z Complete! 2022-08-17T12:37:33.2024323Z ++ uname -r 2022-08-17T12:37:33.2030520Z + sudo yum install -y 'kernel-devel-uname-r == 4.14.252-195.483.amzn2.x86_64' 2022-08-17T12:37:33.7042660Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-08-17T12:37:33.7201149Z Existing lock /var/run/yum.pid: another copy is running as pid 33362. 2022-08-17T12:37:33.7201954Z Another app is currently holding the yum lock; waiting for it to exit... 2022-08-17T12:37:33.7209500Z The other application is: yum 2022-08-17T12:37:33.7209936Z Memory : 96 M RSS (314 MB VSZ) 2022-08-17T12:37:33.7210777Z Started: Wed Aug 17 12:37:31 2022 - 00:02 ago 2022-08-17T12:37:33.7211222Z State : Running, pid: 33362 2022-08-17T12:37:35.7236850Z Another app is currently holding the yum lock; waiting for it to exit... 
2022-08-17T12:37:35.7242752Z The other application is: yum 2022-08-17T12:37:35.7243478Z Memory : 158 M RSS (376 MB VSZ) 2022-08-17T12:37:35.7244179Z Started: Wed Aug 17 12:37:31 2022 - 00:04 ago 2022-08-17T12:37:35.7244589Z State : Running, pid: 33362 2022-08-17T12:37:38.9835247Z Resolving Dependencies 2022-08-17T12:37:38.9841627Z --> Running transaction check 2022-08-17T12:37:38.9842093Z ---> Package kernel-devel.x86_64 0:4.14.252-195.483.amzn2 will be installed 2022-08-17T12:37:39.3256297Z --> Finished Dependency Resolution 2022-08-17T12:37:39.4429798Z 2022-08-17T12:37:39.4430279Z Dependencies Resolved 2022-08-17T12:37:39.4435103Z 2022-08-17T12:37:39.4435261Z ================================================================================ 2022-08-17T12:37:39.4435659Z Package Arch Version Repository Size 2022-08-17T12:37:39.4436021Z ================================================================================ 2022-08-17T12:37:39.4436299Z Installing: 2022-08-17T12:37:39.4436790Z kernel-devel x86_64 4.14.252-195.483.amzn2 amzn2-core 13 M 2022-08-17T12:37:39.4437011Z 2022-08-17T12:37:39.4437128Z Transaction Summary 2022-08-17T12:37:39.4437416Z ================================================================================ 2022-08-17T12:37:39.4437670Z Install 1 Package 2022-08-17T12:37:39.4437835Z 2022-08-17T12:37:39.4437959Z Total download size: 13 M 2022-08-17T12:37:39.4438222Z Installed size: 60 M 2022-08-17T12:37:39.4438492Z Downloading packages: 2022-08-17T12:37:39.4447313Z Delta RPMs disabled because /usr/bin/applydeltarpm not installed. 
2022-08-17T12:37:39.7397032Z Running transaction check 2022-08-17T12:37:39.7582566Z Running transaction test 2022-08-17T12:37:40.1644916Z Transaction test succeeded 2022-08-17T12:37:40.1647902Z Running transaction 2022-08-17T12:37:55.5333480Z Installing : kernel-devel-4.14.252-195.483.amzn2.x86_64 1/1 2022-08-17T12:37:55.6149129Z Verifying : kernel-devel-4.14.252-195.483.amzn2.x86_64 1/1 2022-08-17T12:37:55.6149431Z 2022-08-17T12:37:55.6149561Z Installed: 2022-08-17T12:37:55.6149944Z kernel-devel.x86_64 0:4.14.252-195.483.amzn2 2022-08-17T12:37:55.6150155Z 2022-08-17T12:37:55.6150270Z Complete! 2022-08-17T12:37:55.6491546Z + sudo modprobe backlight 2022-08-17T12:37:55.6677815Z + sudo curl -fsL -o /tmp/nvidia_driver https://s3.amazonaws.com/ossci-linux/nvidia_driver/NVIDIA-Linux-x86_64-515.57.run 2022-08-17T12:37:59.3333782Z + sudo /bin/bash /tmp/nvidia_driver -s --no-drm 2022-08-17T12:38:00.6949212Z Verifying archive integrity... OK 2022-08-17T12:38:27.2110655Z Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 
515.57................ 2022-08-17T12:38:27.3525683Z 2022-08-17T12:38:27.3528274Z WARNING: The nvidia-drm module will not be installed. As a result, DRM-KMS will not function with this installation of the NVIDIA driver. 2022-08-17T12:38:27.3528639Z 2022-08-17T12:38:42.9425043Z 2022-08-17T12:38:42.9426774Z WARNING: nvidia-installer was forced to guess the X library path '/usr/lib64' and X module path '/usr/lib64/xorg/modules'; these paths were not queryable from the system. If X fails to find the NVIDIA X driver module, please install the `pkg-config` utility and the X.Org SDK/development package for your distribution and reinstall the driver. 
2022-08-17T12:38:42.9427425Z 2022-08-17T12:38:51.9884975Z + sudo rm -fv /tmp/nvidia_driver 2022-08-17T12:38:52.0882092Z removed ‘/tmp/nvidia_driver’ 2022-08-17T12:38:52.0895851Z + nvidia-smi 2022-08-17T12:38:56.4365986Z Wed Aug 17 12:38:56 2022 2022-08-17T12:38:56.4366585Z +-----------------------------------------------------------------------------+ 2022-08-17T12:38:56.4368952Z | NVIDIA-SMI 515.57 Driver Version: 515.57 CUDA Version: 11.7 | 2022-08-17T12:38:56.4369506Z |-------------------------------+----------------------+----------------------+ 2022-08-17T12:38:56.4370037Z | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | 2022-08-17T12:38:56.4370544Z | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | 2022-08-17T12:38:56.4370905Z | | | MIG M. | 2022-08-17T12:38:56.4371216Z |===============================+======================+======================| 2022-08-17T12:38:56.4411743Z | 0 Tesla M60 Off | 00000000:00:1D.0 Off | 4253925446 | 2022-08-17T12:38:56.4412127Z | N/A 27C P0 38W / 150W | 0MiB / 7680MiB | 0% Default | 2022-08-17T12:38:56.4412456Z | | | N/A | 2022-08-17T12:38:56.4412961Z +-------------------------------+----------------------+----------------------+ 2022-08-17T12:38:56.4457168Z | 1 Tesla M60 Off | 00000000:00:1E.0 Off | 3939052238 | 2022-08-17T12:38:56.4457533Z | N/A 34C P0 38W / 150W | 0MiB / 7680MiB | 95% Default | 2022-08-17T12:38:56.4457852Z | | | N/A | 2022-08-17T12:38:56.4458335Z +-------------------------------+----------------------+----------------------+ 2022-08-17T12:38:56.4467728Z 2022-08-17T12:38:56.4468220Z +-----------------------------------------------------------------------------+ 2022-08-17T12:38:56.4468609Z | Processes: | 2022-08-17T12:38:56.4468966Z | GPU GI CI PID Type Process name GPU Memory | 2022-08-17T12:38:56.4469304Z | ID ID Usage | 2022-08-17T12:38:56.4469607Z |=============================================================================| 2022-08-17T12:38:56.4470202Z | No running 
processes found | 2022-08-17T12:38:56.4470658Z +-----------------------------------------------------------------------------+ 2022-08-17T12:38:56.9545863Z + echo 'GPU_FLAG=--gpus all' 2022-08-17T12:38:57.2428370Z Command completed after 1 attempt(s). 2022-08-17T12:38:57.2428797Z 2022-08-17T12:38:57.2481226Z ##[group]Run python3 -m pip install psutil==5.9.1 2022-08-17T12:38:57.2481620Z python3 -m pip install psutil==5.9.1 2022-08-17T12:38:57.2481955Z python3 -m pip install pynvml==11.4.1 2022-08-17T12:38:57.2482313Z python3 -m tools.stats.monitor > usage_log.txt 2>&1 & 2022-08-17T12:38:57.2482679Z echo "::set-output name=monitor-script-pid::${!}" 2022-08-17T12:38:57.2496193Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T12:38:57.2496492Z env: 2022-08-17T12:38:57.2496717Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:38:57.2497001Z GPU_FLAG: --gpus all 2022-08-17T12:38:57.2497257Z ##[endgroup] 2022-08-17T12:38:58.2979405Z Defaulting to user installation because normal site-packages is not writeable 2022-08-17T12:38:58.6481128Z Collecting psutil==5.9.1 2022-08-17T12:38:58.6662043Z Downloading psutil-5.9.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (281 kB) 2022-08-17T12:38:58.7371046Z Installing collected packages: psutil 2022-08-17T12:38:58.8930452Z Successfully installed psutil-5.9.1 2022-08-17T12:38:59.3706683Z Defaulting to user installation because normal site-packages is not writeable 2022-08-17T12:38:59.4391191Z Collecting pynvml==11.4.1 2022-08-17T12:38:59.4545932Z Downloading pynvml-11.4.1-py3-none-any.whl (46 kB) 2022-08-17T12:38:59.5031333Z Installing collected packages: pynvml 2022-08-17T12:38:59.5555251Z Successfully installed pynvml-11.4.1 2022-08-17T12:38:59.6086467Z Prepare all required actions 2022-08-17T12:38:59.6086854Z Getting action download info 2022-08-17T12:38:59.7616513Z Download action repository 'seemethere/download-artifact-s3@v4' 
(SHA:ada9688bc02703b63dc0e606da280613803449a5) 2022-08-17T12:39:00.0555418Z Download action repository 'actions/download-artifact@v2' (SHA:f023be2c48cc18debc3bacd34cb396e0295e2869) 2022-08-17T12:39:00.2192741Z ##[group]Run ./.github/actions/download-build-artifacts 2022-08-17T12:39:00.2193036Z with: 2022-08-17T12:39:00.2193320Z name: linux-bionic-cuda11.6-py3.10-gcc7 2022-08-17T12:39:00.2193599Z env: 2022-08-17T12:39:00.2193819Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:39:00.2194085Z GPU_FLAG: --gpus all 2022-08-17T12:39:00.2194333Z ##[endgroup] 2022-08-17T12:39:00.2228006Z ##[group]Run seemethere/download-artifact-s3@v4 2022-08-17T12:39:00.2228297Z with: 2022-08-17T12:39:00.2228560Z name: linux-bionic-cuda11.6-py3.10-gcc7 2022-08-17T12:39:00.2228865Z s3-bucket: gha-artifacts 2022-08-17T12:39:00.2229189Z region: us-east-1 2022-08-17T12:39:00.2229406Z env: 2022-08-17T12:39:00.2229656Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:39:00.2229923Z GPU_FLAG: --gpus all 2022-08-17T12:39:00.2230153Z ##[endgroup] 2022-08-17T12:39:00.7281868Z Found 1 objects with prefix pytorch/pytorch/2875102080/linux-bionic-cuda11.6-py3.10-gcc7/ 2022-08-17T12:39:00.7282493Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2022-08-17T12:39:06.9103393Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2022-08-17T12:39:06.9103751Z 2022-08-17T12:39:06.9105645Z Artifact download has finished successfully 2022-08-17T12:39:06.9239445Z ##[group]Run unzip -o artifacts.zip 2022-08-17T12:39:06.9239762Z unzip -o artifacts.zip 2022-08-17T12:39:06.9253118Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T12:39:06.9253417Z env: 2022-08-17T12:39:06.9253658Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:39:06.9253927Z GPU_FLAG: --gpus all 2022-08-17T12:39:06.9254182Z ##[endgroup] 2022-08-17T12:39:06.9344398Z Archive: artifacts.zip 2022-08-17T12:39:06.9346186Z creating: dist/ 2022-08-17T12:39:08.9575400Z 
inflating: dist/torch-1.13.0a0+gitce6a3c6-cp310-cp310-linux_x86_64.whl 2022-08-17T12:39:08.9575821Z creating: build/custom_test_artifacts/ 2022-08-17T12:39:08.9576247Z creating: build/custom_test_artifacts/custom-op-build/ 2022-08-17T12:39:08.9576707Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2022-08-17T12:39:08.9583894Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeOutput.log 2022-08-17T12:39:08.9584439Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/ 2022-08-17T12:39:08.9585366Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeSystem.cmake 2022-08-17T12:39:08.9585963Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/ 2022-08-17T12:39:08.9586552Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/tmp/ 2022-08-17T12:39:08.9588626Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c 2022-08-17T12:39:08.9589764Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/a.out 2022-08-17T12:39:08.9590336Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/ 2022-08-17T12:39:08.9590900Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/ 2022-08-17T12:39:08.9593643Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp 2022-08-17T12:39:08.9594795Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out 2022-08-17T12:39:08.9596587Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin 2022-08-17T12:39:08.9597405Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake 2022-08-17T12:39:08.9598687Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin 2022-08-17T12:39:08.9599705Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake 2022-08-17T12:39:08.9600300Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/ 2022-08-17T12:39:08.9600864Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/ 2022-08-17T12:39:08.9654977Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2022-08-17T12:39:08.9655702Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2022-08-17T12:39:08.9656413Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2022-08-17T12:39:08.9657155Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2022-08-17T12:39:08.9657884Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2022-08-17T12:39:08.9658576Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2022-08-17T12:39:08.9659260Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2022-08-17T12:39:08.9659958Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2022-08-17T12:39:08.9660851Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2022-08-17T12:39:08.9702797Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2022-08-17T12:39:08.9744016Z 
inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2022-08-17T12:39:08.9745173Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2022-08-17T12:39:08.9745969Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2022-08-17T12:39:08.9746590Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c 2022-08-17T12:39:08.9747222Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin 2022-08-17T12:39:08.9748173Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2022-08-17T12:39:08.9749104Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o 2022-08-17T12:39:08.9750998Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu 2022-08-17T12:39:08.9823800Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out 2022-08-17T12:39:08.9896509Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin 2022-08-17T12:39:08.9897149Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake 2022-08-17T12:39:08.9897707Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2022-08-17T12:39:08.9898352Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeError.log 2022-08-17T12:39:08.9898905Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2022-08-17T12:39:08.9899570Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2022-08-17T12:39:08.9900180Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2022-08-17T12:39:08.9900783Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2022-08-17T12:39:08.9901390Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2022-08-17T12:39:08.9901967Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2022-08-17T12:39:08.9902558Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2022-08-17T12:39:08.9903802Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2022-08-17T12:39:08.9904525Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2022-08-17T12:39:08.9905141Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2022-08-17T12:39:08.9905745Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2022-08-17T12:39:08.9926326Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2022-08-17T12:39:09.0039371Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2022-08-17T12:39:09.0039938Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2022-08-17T12:39:09.0040538Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2022-08-17T12:39:09.0041180Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2022-08-17T12:39:09.0041798Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2022-08-17T12:39:09.0042399Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2022-08-17T12:39:09.0043187Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2022-08-17T12:39:09.0044167Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2022-08-17T12:39:09.0044827Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2022-08-17T12:39:09.0045442Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2022-08-17T12:39:09.0046025Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2022-08-17T12:39:09.0066506Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2022-08-17T12:39:09.0147371Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2022-08-17T12:39:09.0148336Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2022-08-17T12:39:09.0149264Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2022-08-17T12:39:09.0149937Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2022-08-17T12:39:09.0150479Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2022-08-17T12:39:09.0151160Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2022-08-17T12:39:09.0151681Z inflating: build/custom_test_artifacts/custom-op-build/detect_cuda_version.cc 2022-08-17T12:39:09.0154331Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2022-08-17T12:39:09.0155102Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2022-08-17T12:39:09.0155787Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2022-08-17T12:39:09.0247774Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2022-08-17T12:39:09.0309087Z inflating: 
build/custom_test_artifacts/custom-op-build/test_custom_ops 2022-08-17T12:39:09.0309567Z creating: build/custom_test_artifacts/jit-hook-build/ 2022-08-17T12:39:09.0310026Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2022-08-17T12:39:09.0316593Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeOutput.log 2022-08-17T12:39:09.0317113Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/ 2022-08-17T12:39:09.0317667Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeSystem.cmake 2022-08-17T12:39:09.0318217Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/ 2022-08-17T12:39:09.0318760Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/tmp/ 2022-08-17T12:39:09.0321056Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c 2022-08-17T12:39:09.0322164Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/a.out 2022-08-17T12:39:09.0323007Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/ 2022-08-17T12:39:09.0323846Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/ 2022-08-17T12:39:09.0325999Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp 2022-08-17T12:39:09.0327045Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out 2022-08-17T12:39:09.0328830Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin 2022-08-17T12:39:09.0329446Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake 2022-08-17T12:39:09.0331072Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin 2022-08-17T12:39:09.0331997Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake 2022-08-17T12:39:09.0332574Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/ 2022-08-17T12:39:09.0333138Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/ 2022-08-17T12:39:09.0387391Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2022-08-17T12:39:09.0388091Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2022-08-17T12:39:09.0388801Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2022-08-17T12:39:09.0389532Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2022-08-17T12:39:09.0390243Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2022-08-17T12:39:09.0390921Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2022-08-17T12:39:09.0391614Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2022-08-17T12:39:09.0392298Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2022-08-17T12:39:09.0393246Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2022-08-17T12:39:09.0435074Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2022-08-17T12:39:09.0476300Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2022-08-17T12:39:09.0477247Z 
inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2022-08-17T12:39:09.0478067Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2022-08-17T12:39:09.0478706Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c 2022-08-17T12:39:09.0479316Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin 2022-08-17T12:39:09.0480171Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2022-08-17T12:39:09.0481082Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o 2022-08-17T12:39:09.0483000Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu 2022-08-17T12:39:09.0555841Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out 2022-08-17T12:39:09.0628634Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin 2022-08-17T12:39:09.0629264Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake 2022-08-17T12:39:09.0629815Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2022-08-17T12:39:09.0630571Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeError.log 2022-08-17T12:39:09.0631217Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2022-08-17T12:39:09.0631769Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2022-08-17T12:39:09.0632364Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2022-08-17T12:39:09.0632973Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2022-08-17T12:39:09.0633794Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2022-08-17T12:39:09.0634640Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2022-08-17T12:39:09.0635263Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2022-08-17T12:39:09.0636183Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2022-08-17T12:39:09.0636881Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2022-08-17T12:39:09.0637659Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2022-08-17T12:39:09.0638358Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2022-08-17T12:39:09.0658382Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2022-08-17T12:39:09.0721680Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2022-08-17T12:39:09.0722467Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2022-08-17T12:39:09.0723176Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2022-08-17T12:39:09.0723764Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2022-08-17T12:39:09.0724382Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2022-08-17T12:39:09.0725174Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2022-08-17T12:39:09.0725856Z inflating: build/custom_test_artifacts/jit-hook-build/detect_cuda_version.cc 2022-08-17T12:39:09.0728739Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2022-08-17T12:39:09.0729348Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2022-08-17T12:39:09.0730100Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2022-08-17T12:39:09.0779290Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2022-08-17T12:39:09.0779863Z creating: build/custom_test_artifacts/custom-backend-build/ 2022-08-17T12:39:09.0780378Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2022-08-17T12:39:09.0787262Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeOutput.log 2022-08-17T12:39:09.0787978Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/ 2022-08-17T12:39:09.0788654Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeSystem.cmake 2022-08-17T12:39:09.0789263Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/ 2022-08-17T12:39:09.0789916Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/tmp/ 2022-08-17T12:39:09.0791288Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c 2022-08-17T12:39:09.0792992Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/a.out 2022-08-17T12:39:09.0793619Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/ 2022-08-17T12:39:09.0794340Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/ 2022-08-17T12:39:09.0796891Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp 2022-08-17T12:39:09.0797852Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out 2022-08-17T12:39:09.0799219Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin 2022-08-17T12:39:09.0800080Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake 2022-08-17T12:39:09.0801580Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin 2022-08-17T12:39:09.0802672Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake 2022-08-17T12:39:09.0803317Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/ 2022-08-17T12:39:09.0803986Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/ 2022-08-17T12:39:09.0858201Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2022-08-17T12:39:09.0859038Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2022-08-17T12:39:09.0859928Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2022-08-17T12:39:09.0860716Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2022-08-17T12:39:09.0861538Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2022-08-17T12:39:09.0862370Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2022-08-17T12:39:09.0863206Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2022-08-17T12:39:09.0864162Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 
2022-08-17T12:39:09.0865077Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2022-08-17T12:39:09.0905949Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2022-08-17T12:39:09.0947171Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2022-08-17T12:39:09.0948218Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2022-08-17T12:39:09.0948938Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2022-08-17T12:39:09.0949659Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c 2022-08-17T12:39:09.0950477Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin 2022-08-17T12:39:09.0951214Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2022-08-17T12:39:09.0951896Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o 2022-08-17T12:39:09.0953383Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu 2022-08-17T12:39:09.1026700Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out 2022-08-17T12:39:09.1099551Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin 2022-08-17T12:39:09.1100246Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake 2022-08-17T12:39:09.1100901Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 
2022-08-17T12:39:09.1101573Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeError.log 2022-08-17T12:39:09.1102429Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2022-08-17T12:39:09.1103033Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2022-08-17T12:39:09.1103965Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2022-08-17T12:39:09.1104691Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2022-08-17T12:39:09.1105452Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2022-08-17T12:39:09.1106105Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2022-08-17T12:39:09.1143245Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2022-08-17T12:39:09.1144361Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2022-08-17T12:39:09.1145123Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2022-08-17T12:39:09.1145777Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2022-08-17T12:39:09.1146398Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2022-08-17T12:39:09.1147065Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2022-08-17T12:39:09.1259961Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2022-08-17T12:39:09.1260576Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2022-08-17T12:39:09.1261380Z 
inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2022-08-17T12:39:09.1262056Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2022-08-17T12:39:09.1262713Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2022-08-17T12:39:09.1263562Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2022-08-17T12:39:09.1264202Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2022-08-17T12:39:09.1264831Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2022-08-17T12:39:09.1265458Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2022-08-17T12:39:09.1266090Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2022-08-17T12:39:09.1266715Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2022-08-17T12:39:09.1286761Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2022-08-17T12:39:09.1344575Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2022-08-17T12:39:09.1345251Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2022-08-17T12:39:09.1345867Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2022-08-17T12:39:09.1346461Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2022-08-17T12:39:09.1347164Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2022-08-17T12:39:09.1348433Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2022-08-17T12:39:09.1349162Z inflating: build/custom_test_artifacts/custom-backend-build/detect_cuda_version.cc 2022-08-17T12:39:09.1351872Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2022-08-17T12:39:09.1352682Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2022-08-17T12:39:09.1353590Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2022-08-17T12:39:09.1471637Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2022-08-17T12:39:09.1516734Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2022-08-17T12:39:09.1517102Z creating: build/lib/ 2022-08-17T12:39:09.1517743Z inflating: build/lib/libclog.a 2022-08-17T12:39:09.1583839Z inflating: build/lib/libgtest.a 2022-08-17T12:39:09.1594143Z inflating: build/lib/libpthreadpool.a 2022-08-17T12:39:09.1603212Z inflating: build/lib/libittnotify.a 2022-08-17T12:39:09.1709653Z inflating: build/lib/libprotobuf-lite.a 2022-08-17T12:39:09.1800925Z inflating: build/lib/libbenchmark.a 2022-08-17T12:39:09.1833008Z inflating: build/lib/libtensorpipe_uv.a 2022-08-17T12:39:09.1909657Z inflating: build/lib/libasmjit.a 2022-08-17T12:39:09.2042392Z inflating: build/lib/libgloo.a 2022-08-17T12:39:09.2576755Z inflating: build/lib/libprotobuf.a 2022-08-17T12:39:09.2596649Z inflating: build/lib/libfmt.a 2022-08-17T12:39:09.2598465Z inflating: build/lib/libcaffe2_nvrtc.so 2022-08-17T12:39:09.2599018Z inflating: build/lib/libfoxi_loader.a 2022-08-17T12:39:09.2669431Z inflating: build/lib/libc10.so 2022-08-17T12:39:09.2670478Z inflating: build/lib/libtorch_global_deps.so 2022-08-17T12:39:09.2680558Z inflating: build/lib/libcpuinfo.a 2022-08-17T12:39:09.2689397Z inflating: build/lib/libcpuinfo_internals.a 
2022-08-17T12:39:09.2705688Z inflating: build/lib/libqnnpack.a 2022-08-17T12:39:09.2729368Z inflating: build/lib/libpytorch_qnnpack.a 2022-08-17T12:39:09.2731925Z inflating: build/lib/libnnpack_reference_layers.a 2022-08-17T12:39:09.3302327Z inflating: build/lib/libprotoc.a 2022-08-17T12:39:09.3322109Z inflating: build/lib/libgmock.a 2022-08-17T12:39:09.3322795Z inflating: build/lib/libgtest_main.a 2022-08-17T12:39:09.3323646Z inflating: build/lib/libbenchmark_main.a 2022-08-17T12:39:09.3346204Z inflating: build/lib/libnnpack.a 2022-08-17T12:39:10.1467160Z inflating: build/lib/libdnnl.a 2022-08-17T12:39:10.2122362Z inflating: build/lib/libtensorpipe.a 2022-08-17T12:39:10.2167099Z inflating: build/lib/libc10_cuda.so 2022-08-17T12:39:10.2308550Z inflating: build/lib/libXNNPACK.a 2022-08-17T12:39:10.3835601Z inflating: build/lib/libfbgemm.a 2022-08-17T12:39:10.3836186Z inflating: build/lib/libgmock_main.a 2022-08-17T12:39:10.4258299Z inflating: build/lib/libkineto.a 2022-08-17T12:39:10.5388160Z inflating: build/lib/libdnnl_graph.a 2022-08-17T12:39:10.5678095Z inflating: build/lib/libtensorpipe_cuda.a 2022-08-17T12:39:10.5723488Z inflating: build/lib/libcaffe2_protos.a 2022-08-17T12:39:10.5771366Z inflating: build/lib/libonnx_proto.a 2022-08-17T12:39:10.6449775Z inflating: build/lib/libonnx.a 2022-08-17T12:39:12.9106273Z inflating: build/lib/libtorch_cpu.so 2022-08-17T12:39:12.9534744Z inflating: build/lib/libgloo_cuda.a 2022-08-17T12:39:13.2793400Z inflating: build/lib/libtorch_cuda_cpp.so 2022-08-17T12:39:15.0036930Z inflating: build/lib/libtorch_cuda_cu.so 2022-08-17T12:39:15.0037548Z inflating: build/lib/libtorch_cuda.so 2022-08-17T12:39:15.0039314Z inflating: build/lib/libtorch.so 2022-08-17T12:39:15.0042860Z inflating: build/lib/libc10d_cuda_test.so 2022-08-17T12:39:16.1367023Z inflating: build/lib/libtorch_cuda_linalg.so 2022-08-17T12:39:16.1420033Z inflating: build/lib/libtorchbind_test.so 2022-08-17T12:39:16.1443898Z inflating: build/lib/libjitbackend_test.so 
2022-08-17T12:39:16.1474428Z inflating: build/lib/libbackend_with_compiler.so 2022-08-17T12:39:16.1479171Z inflating: build/lib/libshm.so 2022-08-17T12:39:16.3191677Z inflating: build/lib/libtorch_python.so 2022-08-17T12:39:16.3230844Z inflating: build/lib/libnnapi_backend.so 2022-08-17T12:39:16.3231157Z creating: build/bin/ 2022-08-17T12:39:16.3283499Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2022-08-17T12:39:16.3338629Z inflating: build/bin/c10_DeviceGuard_test 2022-08-17T12:39:16.3392319Z inflating: build/bin/c10_Device_test 2022-08-17T12:39:16.3455293Z inflating: build/bin/c10_DispatchKeySet_test 2022-08-17T12:39:16.3506540Z inflating: build/bin/c10_StreamGuard_test 2022-08-17T12:39:16.3559945Z inflating: build/bin/c10_SymInt_test 2022-08-17T12:39:16.3619205Z inflating: build/bin/c10_InlineDeviceGuard_test 2022-08-17T12:39:16.3678908Z inflating: build/bin/c10_InlineStreamGuard_test 2022-08-17T12:39:16.3740812Z inflating: build/bin/c10_SizesAndStrides_test 2022-08-17T12:39:16.3792085Z inflating: build/bin/c10_Array_test 2022-08-17T12:39:16.3848618Z inflating: build/bin/c10_Bitset_test 2022-08-17T12:39:16.6495635Z inflating: build/bin/c10_C++17_test 2022-08-17T12:39:16.6547026Z inflating: build/bin/c10_ConstexprCrc_test 2022-08-17T12:39:16.6599825Z inflating: build/bin/c10_Half_test 2022-08-17T12:39:16.6652141Z inflating: build/bin/c10_DeadlockDetection_test 2022-08-17T12:39:16.6714103Z inflating: build/bin/c10_LeftRight_test 2022-08-17T12:39:16.6781218Z inflating: build/bin/c10_Metaprogramming_test 2022-08-17T12:39:16.6834799Z inflating: build/bin/c10_Synchronized_test 2022-08-17T12:39:16.6990434Z inflating: build/bin/c10_SmallVectorTest 2022-08-17T12:39:16.7046358Z inflating: build/bin/c10_TypeIndex_test 2022-08-17T12:39:16.7107576Z inflating: build/bin/c10_ThreadLocal_test 2022-08-17T12:39:16.7161364Z inflating: build/bin/c10_TypeList_test 2022-08-17T12:39:16.7212625Z inflating: build/bin/c10_TypeTraits_test 2022-08-17T12:39:16.7267542Z 
inflating: build/bin/c10_accumulate_test 2022-08-17T12:39:16.7327135Z inflating: build/bin/c10_bfloat16_test 2022-08-17T12:39:16.7384692Z inflating: build/bin/c10_complex_math_test 2022-08-17T12:39:16.7443897Z inflating: build/bin/c10_complex_test 2022-08-17T12:39:16.7561494Z inflating: build/bin/c10_either_test 2022-08-17T12:39:16.7617247Z inflating: build/bin/c10_exception_test 2022-08-17T12:39:16.7670653Z inflating: build/bin/c10_flags_test 2022-08-17T12:39:16.7853079Z inflating: build/bin/c10_intrusive_ptr_test 2022-08-17T12:39:16.7906798Z inflating: build/bin/c10_irange_test 2022-08-17T12:39:16.7968361Z inflating: build/bin/c10_logging_test 2022-08-17T12:39:16.8034625Z inflating: build/bin/c10_ordered_preserving_dict_test 2022-08-17T12:39:16.8114494Z inflating: build/bin/c10_optional_test 2022-08-17T12:39:16.8172544Z inflating: build/bin/c10_registry_test 2022-08-17T12:39:16.8227539Z inflating: build/bin/c10_tempfile_test 2022-08-17T12:39:16.8290713Z inflating: build/bin/c10_string_view_test 2022-08-17T12:39:16.8351079Z inflating: build/bin/c10_typeid_test 2022-08-17T12:39:16.8411377Z inflating: build/bin/c10_intrusive_ptr_benchmark 2022-08-17T12:39:16.8933093Z inflating: build/bin/protoc-3.13.0.0 2022-08-17T12:39:16.9454229Z inflating: build/bin/protoc 2022-08-17T12:39:16.9506054Z inflating: build/bin/c10_cuda_CUDATest 2022-08-17T12:39:16.9823305Z inflating: build/bin/vec_test_all_types_DEFAULT 2022-08-17T12:39:17.0177292Z inflating: build/bin/vec_test_all_types_AVX2 2022-08-17T12:39:17.0234536Z inflating: build/bin/HashStoreTest 2022-08-17T12:39:17.0291712Z inflating: build/bin/FileStoreTest 2022-08-17T12:39:17.0356501Z inflating: build/bin/TCPStoreTest 2022-08-17T12:39:17.0372126Z inflating: build/bin/ProcessGroupMPITest 2022-08-17T12:39:17.0375166Z inflating: build/bin/example_allreduce 2022-08-17T12:39:17.0436534Z inflating: build/bin/scalar_test 2022-08-17T12:39:17.0500821Z inflating: build/bin/basic 2022-08-17T12:39:17.0564382Z inflating: 
build/bin/apply_utils_test 2022-08-17T12:39:17.0620246Z inflating: build/bin/Dimname_test 2022-08-17T12:39:17.0682988Z inflating: build/bin/atest 2022-08-17T12:39:17.0761286Z inflating: build/bin/Dict_test 2022-08-17T12:39:17.0822608Z inflating: build/bin/NamedTensor_test 2022-08-17T12:39:17.0882274Z inflating: build/bin/half_test 2022-08-17T12:39:17.0939844Z inflating: build/bin/broadcast_test 2022-08-17T12:39:17.0995023Z inflating: build/bin/wrapdim_test 2022-08-17T12:39:17.1047964Z inflating: build/bin/dlconvertor_test 2022-08-17T12:39:17.1107707Z inflating: build/bin/native_test 2022-08-17T12:39:17.1167785Z inflating: build/bin/scalar_tensor_test 2022-08-17T12:39:17.1223306Z inflating: build/bin/undefined_tensor_test 2022-08-17T12:39:17.1282847Z inflating: build/bin/test_parallel 2022-08-17T12:39:17.1284154Z inflating: build/bin/verify_api_visibility 2022-08-17T12:39:17.1286886Z inflating: build/bin/thread_init_test 2022-08-17T12:39:17.1341376Z inflating: build/bin/weakref_test 2022-08-17T12:39:17.1403053Z inflating: build/bin/quantized_test 2022-08-17T12:39:17.1463726Z inflating: build/bin/extension_backend_test 2022-08-17T12:39:17.1517013Z inflating: build/bin/operators_test 2022-08-17T12:39:17.1569438Z inflating: build/bin/lazy_tensor_test 2022-08-17T12:39:17.1653091Z inflating: build/bin/tensor_iterator_test 2022-08-17T12:39:17.1709445Z inflating: build/bin/math_kernel_test 2022-08-17T12:39:17.1764236Z inflating: build/bin/memory_overlapping_test 2022-08-17T12:39:17.1826834Z inflating: build/bin/cpu_generator_test 2022-08-17T12:39:17.1882320Z inflating: build/bin/mobile_memory_cleanup 2022-08-17T12:39:17.1938006Z inflating: build/bin/cpu_profiling_allocator_test 2022-08-17T12:39:17.1990812Z inflating: build/bin/variant_test 2022-08-17T12:39:17.2060291Z inflating: build/bin/pow_test 2022-08-17T12:39:17.2112859Z inflating: build/bin/reduce_ops_test 2022-08-17T12:39:17.2167005Z inflating: build/bin/reportMemoryUsage_test 2022-08-17T12:39:17.2222608Z inflating: 
build/bin/memory_format_test 2022-08-17T12:39:17.2316721Z inflating: build/bin/cpu_rng_test 2022-08-17T12:39:17.2417450Z inflating: build/bin/ivalue_test 2022-08-17T12:39:17.2490725Z inflating: build/bin/vmap_test 2022-08-17T12:39:17.2545685Z inflating: build/bin/stride_properties_test 2022-08-17T12:39:17.2610200Z inflating: build/bin/type_test 2022-08-17T12:39:17.2663333Z inflating: build/bin/dispatch_key_set_test 2022-08-17T12:39:17.2727148Z inflating: build/bin/IListRef_test 2022-08-17T12:39:17.2845319Z inflating: build/bin/List_test 2022-08-17T12:39:17.2973805Z inflating: build/bin/kernel_function_legacy_test 2022-08-17T12:39:17.3077648Z inflating: build/bin/kernel_function_test 2022-08-17T12:39:17.3147234Z inflating: build/bin/KernelFunction_test 2022-08-17T12:39:17.3282919Z inflating: build/bin/kernel_lambda_legacy_test 2022-08-17T12:39:17.3393976Z inflating: build/bin/kernel_lambda_test 2022-08-17T12:39:17.3459058Z inflating: build/bin/kernel_stackbased_test 2022-08-17T12:39:17.3562378Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2022-08-17T12:39:17.3617057Z inflating: build/bin/CppSignature_test 2022-08-17T12:39:17.3669330Z inflating: build/bin/op_allowlist_test 2022-08-17T12:39:17.3726609Z inflating: build/bin/inline_container_test 2022-08-17T12:39:17.4033469Z inflating: build/bin/op_registration_test 2022-08-17T12:39:17.4094682Z inflating: build/bin/backend_fallback_test 2022-08-17T12:39:17.4152690Z inflating: build/bin/cuda_caching_host_allocator_test 2022-08-17T12:39:17.4219266Z inflating: build/bin/cuda_atomic_ops_test 2022-08-17T12:39:17.4275903Z inflating: build/bin/cuda_apply_test 2022-08-17T12:39:17.4349965Z inflating: build/bin/cuda_complex_math_test 2022-08-17T12:39:17.4413731Z inflating: build/bin/cuda_complex_test 2022-08-17T12:39:17.4469303Z inflating: build/bin/cuda_integer_divider_test 2022-08-17T12:39:17.4522893Z inflating: build/bin/cuda_device_test 2022-08-17T12:39:17.4589431Z inflating: build/bin/cuda_stream_test 
2022-08-17T12:39:17.4646215Z inflating: build/bin/cuda_reportMemoryUsage_test 2022-08-17T12:39:17.4700116Z inflating: build/bin/cuda_half_test 2022-08-17T12:39:17.4772883Z inflating: build/bin/cuda_distributions_test 2022-08-17T12:39:17.4825766Z inflating: build/bin/cuda_optional_test 2022-08-17T12:39:17.4881862Z inflating: build/bin/cuda_packedtensoraccessor_test 2022-08-17T12:39:17.4939308Z inflating: build/bin/cuda_vectorized_test 2022-08-17T12:39:17.4993514Z inflating: build/bin/cuda_dlconvertor_test 2022-08-17T12:39:17.5046198Z inflating: build/bin/cuda_cudnn_test 2022-08-17T12:39:17.5111434Z inflating: build/bin/cuda_cub_test 2022-08-17T12:39:17.5175130Z inflating: build/bin/cuda_generator_test 2022-08-17T12:39:17.5193270Z inflating: build/bin/tutorial_tensorexpr 2022-08-17T12:39:17.5263279Z inflating: build/bin/ProcessGroupGlooTest 2022-08-17T12:39:17.5327486Z inflating: build/bin/ProcessGroupGlooAsyncTest 2022-08-17T12:39:17.5393946Z inflating: build/bin/ProcessGroupNCCLTest 2022-08-17T12:39:17.5457793Z inflating: build/bin/ProcessGroupNCCLErrorsTest 2022-08-17T12:39:17.5516250Z inflating: build/bin/test_dist_autograd 2022-08-17T12:39:17.5591639Z inflating: build/bin/test_cpp_rpc 2022-08-17T12:39:17.5595105Z inflating: build/bin/parallel_benchmark 2022-08-17T12:39:17.5669989Z inflating: build/bin/test_mobile_nnc 2022-08-17T12:39:17.5682124Z inflating: build/bin/aot_model_compiler_test 2022-08-17T12:39:17.6589487Z inflating: build/bin/test_tensorexpr 2022-08-17T12:39:17.6974727Z inflating: build/bin/test_lazy 2022-08-17T12:39:17.6981074Z inflating: build/bin/torch_shm_manager 2022-08-17T12:39:17.8283647Z inflating: build/bin/test_api 2022-08-17T12:39:17.9354601Z inflating: build/bin/test_jit 2022-08-17T12:39:17.9356476Z inflating: .pytorch-test-times.json 2022-08-17T12:39:17.9389033Z ##[group]Run df -H 2022-08-17T12:39:17.9389303Z df -H 2022-08-17T12:39:17.9402701Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T12:39:17.9403008Z env: 
2022-08-17T12:39:17.9403256Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:39:17.9403511Z GPU_FLAG: --gpus all 2022-08-17T12:39:17.9403763Z ##[endgroup] 2022-08-17T12:39:17.9442806Z Filesystem Size Used Avail Use% Mounted on 2022-08-17T12:39:17.9443153Z devtmpfs 129G 0 129G 0% /dev 2022-08-17T12:39:17.9443431Z tmpfs 129G 0 129G 0% /dev/shm 2022-08-17T12:39:17.9443716Z tmpfs 129G 529k 129G 1% /run 2022-08-17T12:39:17.9444009Z tmpfs 129G 0 129G 0% /sys/fs/cgroup 2022-08-17T12:39:17.9444475Z /dev/xvda1 162G 30G 132G 19% / 2022-08-17T12:39:18.0792686Z ##[group]Run .github/scripts/parse_ref.py 2022-08-17T12:39:18.0793060Z .github/scripts/parse_ref.py 2022-08-17T12:39:18.0805188Z shell: /usr/bin/bash -e {0} 2022-08-17T12:39:18.0805422Z env: 2022-08-17T12:39:18.0805669Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:39:18.0805944Z GPU_FLAG: --gpus all 2022-08-17T12:39:18.0806179Z ##[endgroup] 2022-08-17T12:39:18.2076240Z ##[group]Run set -x 2022-08-17T12:39:18.2076630Z set -x 2022-08-17T12:39:18.2076871Z  2022-08-17T12:39:18.2077137Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2022-08-17T12:39:18.2077508Z  TEST_COMMAND=.jenkins/pytorch/multigpu-test.sh 2022-08-17T12:39:18.2077879Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2022-08-17T12:39:18.2078218Z  TEST_COMMAND=.jenkins/caffe2/test.sh 2022-08-17T12:39:18.2078485Z else 2022-08-17T12:39:18.2078783Z  TEST_COMMAND=.jenkins/pytorch/test.sh 2022-08-17T12:39:18.2079070Z fi 2022-08-17T12:39:18.2079280Z  2022-08-17T12:39:18.2079617Z COMMIT_MESSAGES=$(git cherry -v "origin/${GIT_DEFAULT_BRANCH:-master}") 2022-08-17T12:39:18.2080077Z  2022-08-17T12:39:18.2080370Z # sanitize the input commit message and PR body here: 2022-08-17T12:39:18.2080676Z # 2022-08-17T12:39:18.2081081Z # trim all new lines from commit messages + PR_BODY to avoid issues with batch environment 2022-08-17T12:39:18.2081601Z # variable copying. 
see https://github.com/pytorch/pytorch/pull/80043#issuecomment-1167796028 2022-08-17T12:39:18.2082089Z COMMIT_MESSAGES="${COMMIT_MESSAGES//[$'\n\r']}" 2022-08-17T12:39:18.2082421Z PR_BODY="${PR_BODY//[$'\n\r']}" 2022-08-17T12:39:18.2082689Z  2022-08-17T12:39:18.2083048Z # then trim all special characters like single and double quotes to avoid unescaped inputs to 2022-08-17T12:39:18.2083448Z # wreak havoc internally 2022-08-17T12:39:18.2083788Z export COMMIT_MESSAGES="${COMMIT_MESSAGES//[\'\"]}" 2022-08-17T12:39:18.2084120Z export PR_BODY="${PR_BODY//[\'\"]}" 2022-08-17T12:39:18.2084402Z  2022-08-17T12:39:18.2084728Z # detached container should get cleaned up by teardown_ec2_linux 2022-08-17T12:39:18.2085154Z # TODO: Stop building test binaries as part of the build phase 2022-08-17T12:39:18.2085534Z # Used for GPU_FLAG since that doesn't play nice 2022-08-17T12:39:18.2085883Z # shellcheck disable=SC2086,SC2090 2022-08-17T12:39:18.2086200Z container_name=$(docker run \ 2022-08-17T12:39:18.2086474Z  ${GPU_FLAG:-} \ 2022-08-17T12:39:18.2086759Z  -e BUILD_ENVIRONMENT \ 2022-08-17T12:39:18.2087045Z  -e PR_NUMBER \ 2022-08-17T12:39:18.2087310Z  -e GITHUB_ACTIONS \ 2022-08-17T12:39:18.2087582Z  -e BASE_SHA \ 2022-08-17T12:39:18.2087843Z  -e BRANCH \ 2022-08-17T12:39:18.2088085Z  -e SHA1 \ 2022-08-17T12:39:18.2088359Z  -e AWS_DEFAULT_REGION \ 2022-08-17T12:39:18.2088644Z  -e IN_WHEEL_TEST \ 2022-08-17T12:39:18.2088909Z  -e SHARD_NUMBER \ 2022-08-17T12:39:18.2089186Z  -e TEST_CONFIG \ 2022-08-17T12:39:18.2089466Z  -e NUM_TEST_SHARDS \ 2022-08-17T12:39:18.2089741Z  -e PR_BODY \ 2022-08-17T12:39:18.2090002Z  -e COMMIT_MESSAGES \ 2022-08-17T12:39:18.2090309Z  -e PYTORCH_RETRY_TEST_CASES \ 2022-08-17T12:39:18.2090641Z  -e PYTORCH_OVERRIDE_FLAKY_SIGNAL \ 2022-08-17T12:39:18.2090929Z  -e PR_LABELS \ 2022-08-17T12:39:18.2091235Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2022-08-17T12:39:18.2091542Z  -e SCCACHE_BUCKET \ 2022-08-17T12:39:18.2091798Z  -e XLA_CUDA \ 
2022-08-17T12:39:18.2092098Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \ 2022-08-17T12:39:18.2092457Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2022-08-17T12:39:18.2092778Z  --ulimit stack=10485760:83886080 \ 2022-08-17T12:39:18.2093107Z  --security-opt seccomp=unconfined \ 2022-08-17T12:39:18.2093443Z  --cap-add=SYS_PTRACE \ 2022-08-17T12:39:18.2093728Z  --ipc=host \ 2022-08-17T12:39:18.2093995Z  --shm-size="${SHM_SIZE}" \ 2022-08-17T12:39:18.2094268Z  --tty \ 2022-08-17T12:39:18.2094591Z  --detach \ 2022-08-17T12:39:18.2094871Z  --name="${container_name}" \ 2022-08-17T12:39:18.2095164Z  --user jenkins \ 2022-08-17T12:39:18.2095508Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2022-08-17T12:39:18.2095858Z  -w /var/lib/jenkins/workspace \ 2022-08-17T12:39:18.2096158Z  "${DOCKER_IMAGE}" 2022-08-17T12:39:18.2096410Z ) 2022-08-17T12:39:18.2096749Z docker exec -t "${container_name}" sh -c "pip install dist/*.whl && ${TEST_COMMAND}" 2022-08-17T12:39:18.2109257Z shell: /usr/bin/bash -e {0} 2022-08-17T12:39:18.2109510Z env: 2022-08-17T12:39:18.2109733Z GIT_DEFAULT_BRANCH: master 2022-08-17T12:39:18.2110002Z GPU_FLAG: --gpus all 2022-08-17T12:39:18.2110427Z BUILD_ENVIRONMENT: linux-bionic-cuda11.6-py3.10-gcc7 2022-08-17T12:39:18.2110746Z PR_NUMBER: 82657 2022-08-17T12:39:18.2110977Z BRANCH: pull/82657 2022-08-17T12:39:18.2111278Z SHA1: ce6a3c605df99d1df57c0dda75c06d748e54ed2a 2022-08-17T12:39:18.2111610Z BASE_SHA: 343b5f86512f75f8f3bd4b90749c0459743b9e72 2022-08-17T12:39:18.2111894Z PYTORCH_RETRY_TEST_CASES: 1 2022-08-17T12:39:18.2112182Z PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1 2022-08-17T12:39:18.2112463Z TEST_CONFIG: distributed 2022-08-17T12:39:18.2112700Z SHARD_NUMBER: 2 2022-08-17T12:39:18.2112941Z NUM_TEST_SHARDS: 2 2022-08-17T12:39:18.2113729Z PR_BODY: ### Description This PR replaces `DecompositionInterpreter` with a context manager which decomposes a function into prims only if the decomposition is executable by nvFuser. 
Partitioning of the graph is removed because it's handled by the `execute(..., executor="nvfuser")` function. ### Testing Existing tests in `test/test_fx_backends.py`. 2022-08-17T12:39:18.2114488Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2022-08-17T12:39:18.2114778Z SHM_SIZE: 2g 2022-08-17T12:39:18.2115262Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:a347f7e7645f04fc68e4f87c73cf0385233153b8 2022-08-17T12:39:18.2115729Z XLA_CUDA: 2022-08-17T12:39:18.2116082Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2022-08-17T12:39:18.2116420Z ##[endgroup] 2022-08-17T12:39:18.2146070Z + [[ distributed == \m\u\l\t\i\g\p\u ]] 2022-08-17T12:39:18.2146961Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *onnx* ]] 2022-08-17T12:39:18.2147455Z + TEST_COMMAND=.jenkins/pytorch/test.sh 2022-08-17T12:39:18.2150189Z ++ git cherry -v origin/master 2022-08-17T12:39:18.2275399Z + COMMIT_MESSAGES='+ 52429d4ec9d162496e677edd78a883288d617a90 Add a helper for make_fx 2022-08-17T12:39:18.2276309Z + 80daf93686965f0383b5fa106ed8782b6905d891 Map torch.ops.aten calls to refs using decomposition_table 2022-08-17T12:39:18.2277348Z + e03d99a2da7e26168d8878f4e77d86f9c1af572a Use TorchRefsNvfuserCapabilityMode for fx.passes.backends.nvfuser 2022-08-17T12:39:18.2278358Z + a2242f67da754bd2714e056764d0cea51f8cdaed Fix output of PartitionedInterpreter to use _out_spec 2022-08-17T12:39:18.2279249Z + fcca9240398d1cf986a5606a61c50f7903d4ee96 Remove incorrect asserts from tests 2022-08-17T12:39:18.2280337Z + 209563d78b9998025726c92d4fa419151454f79e Add a link to the discussion about torch.ops.aten->refs path 2022-08-17T12:39:18.2281256Z + d3f1f74e60a1d82de78a545b7a7a177e3de6c8a2 Revert changes to torch/fx/passes/backends/nvfuser.py 2022-08-17T12:39:18.2282409Z + ce6a3c605df99d1df57c0dda75c06d748e54ed2a Add aten->refs->prims under TorchRefsMode test' 2022-08-17T12:39:18.2289006Z + COMMIT_MESSAGES='+ 
52429d4ec9d162496e677edd78a883288d617a90 Add a helper for make_fx+ 80daf93686965f0383b5fa106ed8782b6905d891 Map torch.ops.aten calls to refs using decomposition_table+ e03d99a2da7e26168d8878f4e77d86f9c1af572a Use TorchRefsNvfuserCapabilityMode for fx.passes.backends.nvfuser+ a2242f67da754bd2714e056764d0cea51f8cdaed Fix output of PartitionedInterpreter to use _out_spec+ fcca9240398d1cf986a5606a61c50f7903d4ee96 Remove incorrect asserts from tests+ 209563d78b9998025726c92d4fa419151454f79e Add a link to the discussion about torch.ops.aten->refs path+ d3f1f74e60a1d82de78a545b7a7a177e3de6c8a2 Revert changes to torch/fx/passes/backends/nvfuser.py+ ce6a3c605df99d1df57c0dda75c06d748e54ed2a Add aten->refs->prims under TorchRefsMode test' 2022-08-17T12:39:18.2293017Z + PR_BODY='### DescriptionThis PR replaces `DecompositionInterpreter` with a context manager which decomposes a function into prims only if the decomposition is executable by nvFuser.Partitioning of the graph is removed because it'\''s handled by the `execute(..., executor="nvfuser")` function.### TestingExisting tests in `test/test_fx_backends.py`.' 
2022-08-17T12:39:18.2299805Z + export 'COMMIT_MESSAGES=+ 52429d4ec9d162496e677edd78a883288d617a90 Add a helper for make_fx+ 80daf93686965f0383b5fa106ed8782b6905d891 Map torch.ops.aten calls to refs using decomposition_table+ e03d99a2da7e26168d8878f4e77d86f9c1af572a Use TorchRefsNvfuserCapabilityMode for fx.passes.backends.nvfuser+ a2242f67da754bd2714e056764d0cea51f8cdaed Fix output of PartitionedInterpreter to use _out_spec+ fcca9240398d1cf986a5606a61c50f7903d4ee96 Remove incorrect asserts from tests+ 209563d78b9998025726c92d4fa419151454f79e Add a link to the discussion about torch.ops.aten->refs path+ d3f1f74e60a1d82de78a545b7a7a177e3de6c8a2 Revert changes to torch/fx/passes/backends/nvfuser.py+ ce6a3c605df99d1df57c0dda75c06d748e54ed2a Add aten->refs->prims under TorchRefsMode test' 2022-08-17T12:39:18.2305434Z + COMMIT_MESSAGES='+ 52429d4ec9d162496e677edd78a883288d617a90 Add a helper for make_fx+ 80daf93686965f0383b5fa106ed8782b6905d891 Map torch.ops.aten calls to refs using decomposition_table+ e03d99a2da7e26168d8878f4e77d86f9c1af572a Use TorchRefsNvfuserCapabilityMode for fx.passes.backends.nvfuser+ a2242f67da754bd2714e056764d0cea51f8cdaed Fix output of PartitionedInterpreter to use _out_spec+ fcca9240398d1cf986a5606a61c50f7903d4ee96 Remove incorrect asserts from tests+ 209563d78b9998025726c92d4fa419151454f79e Add a link to the discussion about torch.ops.aten->refs path+ d3f1f74e60a1d82de78a545b7a7a177e3de6c8a2 Revert changes to torch/fx/passes/backends/nvfuser.py+ ce6a3c605df99d1df57c0dda75c06d748e54ed2a Add aten->refs->prims under TorchRefsMode test' 2022-08-17T12:39:18.2307360Z + export 'PR_BODY=### DescriptionThis PR replaces `DecompositionInterpreter` with a context manager which decomposes a function into prims only if the decomposition is executable by nvFuser.Partitioning of the graph is removed because its handled by the `execute(..., executor=nvfuser)` function.### TestingExisting tests in `test/test_fx_backends.py`.' 
2022-08-17T12:39:18.2308783Z + PR_BODY='### DescriptionThis PR replaces `DecompositionInterpreter` with a context manager which decomposes a function into prims only if the decomposition is executable by nvFuser.Partitioning of the graph is removed because its handled by the `execute(..., executor=nvfuser)` function.### TestingExisting tests in `test/test_fx_backends.py`.' 2022-08-17T12:39:18.2311749Z +++ nproc --ignore=2 2022-08-17T12:39:18.4848473Z ++ docker run --gpus all -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e PR_BODY -e COMMIT_MESSAGES -e PYTORCH_RETRY_TEST_CASES -e PYTORCH_OVERRIDE_FLAKY_SIGNAL -e PR_LABELS -e MAX_JOBS=30 -e SCCACHE_BUCKET -e XLA_CUDA -e XLA_CLANG_CACHE_S3_BUCKET_NAME --env-file=/tmp/github_env_2875102080 --ulimit stack=10485760:83886080 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --ipc=host --shm-size=2g --tty --detach --name= --user jenkins -v /home/ec2-user/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:a347f7e7645f04fc68e4f87c73cf0385233153b8 2022-08-17T12:39:36.4252637Z + container_name=20a8245f11468fcac7d3aeef57d0511f555b38bc9ada231a407ac49b94964c98 2022-08-17T12:39:36.4253710Z + docker exec -t 20a8245f11468fcac7d3aeef57d0511f555b38bc9ada231a407ac49b94964c98 sh -c 'pip install dist/*.whl && .jenkins/pytorch/test.sh' 2022-08-17T12:39:36.9889889Z Processing ./dist/torch-1.13.0a0+gitce6a3c6-cp310-cp310-linux_x86_64.whl 2022-08-17T12:39:37.0768755Z Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.10/site-packages (from torch==1.13.0a0+gitce6a3c6) (4.3.0) 2022-08-17T12:39:37.6097546Z Installing collected packages: torch 2022-08-17T12:39:47.9363286Z Successfully installed torch-1.13.0a0+gitce6a3c6 2022-08-17T12:39:48.0015868Z ++ 
python -c 'import site; print(site.getsitepackages()[0])' 2022-08-17T12:39:48.0238495Z + TORCH_INSTALL_DIR=/opt/conda/lib/python3.10/site-packages/torch 2022-08-17T12:39:48.0239140Z + TORCH_BIN_DIR=/opt/conda/lib/python3.10/site-packages/torch/bin 2022-08-17T12:39:48.0243978Z + TORCH_LIB_DIR=/opt/conda/lib/python3.10/site-packages/torch/lib 2022-08-17T12:39:48.0244497Z + TORCH_TEST_DIR=/opt/conda/lib/python3.10/site-packages/torch/test 2022-08-17T12:39:48.0244940Z + BUILD_DIR=build 2022-08-17T12:39:48.0245344Z + BUILD_RENAMED_DIR=build_renamed 2022-08-17T12:39:48.0245632Z + BUILD_BIN_DIR=build/bin 2022-08-17T12:39:48.0246026Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 != *bazel* ]] 2022-08-17T12:39:48.0246376Z ++ realpath build/custom_test_artifacts 2022-08-17T12:39:48.0248475Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/workspace/build/custom_test_artifacts 2022-08-17T12:39:48.0251894Z ++ dirname .jenkins/pytorch/test.sh 2022-08-17T12:39:48.0258689Z + source .jenkins/pytorch/common.sh 2022-08-17T12:39:48.0262594Z +++ dirname .jenkins/pytorch/common.sh 2022-08-17T12:39:48.0273507Z ++ source .jenkins/pytorch/common_utils.sh 2022-08-17T12:39:48.0275202Z +++ declare -f -t trap_add 2022-08-17T12:39:48.0279758Z ++ set -ex 2022-08-17T12:39:48.0280155Z ++ [[ linux-bionic-cuda11.6-py3.10-gcc7 == *rocm* ]] 2022-08-17T12:39:48.0280477Z ++ BUILD_TEST_LIBTORCH=0 2022-08-17T12:39:48.0280912Z ++ [[ distributed == *xla* ]] 2022-08-17T12:39:48.0281403Z ++ [[ linux-bionic-cuda11.6-py3.10-gcc7 == *centos* ]] 2022-08-17T12:39:48.0281878Z ++ [[ linux-bionic-cuda11.6-py3.10-gcc7 == *linux-bionic* ]] 2022-08-17T12:39:48.0282180Z ++ which conda 2022-08-17T12:39:48.0291061Z /opt/conda/bin/conda 2022-08-17T12:39:48.0291398Z ++ conda install -q -y cmake 2022-08-17T12:39:50.4388870Z Collecting package metadata (current_repodata.json): ...working... done 2022-08-17T12:39:50.9038037Z Solving environment: ...working... 
done 2022-08-17T12:39:51.0075377Z 2022-08-17T12:39:51.0075793Z ## Package Plan ## 2022-08-17T12:39:51.0075970Z 2022-08-17T12:39:51.0076195Z environment location: /opt/conda 2022-08-17T12:39:51.0076504Z 2022-08-17T12:39:51.0076635Z added / updated specs: 2022-08-17T12:39:51.0079650Z - cmake 2022-08-17T12:39:51.0079824Z 2022-08-17T12:39:51.0080007Z 2022-08-17T12:39:51.0080417Z The following packages will be downloaded: 2022-08-17T12:39:51.0080804Z 2022-08-17T12:39:51.0081045Z package | build 2022-08-17T12:39:51.0081535Z ---------------------------|----------------- 2022-08-17T12:39:51.0081977Z c-ares-1.18.1 | h7f8727e_0 114 KB 2022-08-17T12:39:51.0082368Z cmake-3.22.1 | h1fce559_0 7.3 MB 2022-08-17T12:39:51.0082796Z expat-2.4.4 | h295c915_0 169 KB 2022-08-17T12:39:51.0083189Z krb5-1.19.2 | hac12032_0 1.2 MB 2022-08-17T12:39:51.0083584Z libcurl-7.84.0 | h91b91d3_0 337 KB 2022-08-17T12:39:51.0083964Z libedit-3.1.20210910 | h7f8727e_0 166 KB 2022-08-17T12:39:51.0084354Z libev-4.33 | h7f8727e_1 111 KB 2022-08-17T12:39:51.0084749Z libnghttp2-1.46.0 | hce63b2e_0 680 KB 2022-08-17T12:39:51.0085319Z libssh2-1.10.0 | h8f2d780_0 274 KB 2022-08-17T12:39:51.0085733Z libuv-1.40.0 | h7b6447c_0 736 KB 2022-08-17T12:39:51.0086107Z lz4-c-1.9.3 | h295c915_1 185 KB 2022-08-17T12:39:51.0086689Z rhash-1.4.1 | h3c74f83_1 203 KB 2022-08-17T12:39:51.0087104Z zstd-1.5.2 | ha4553b6_0 488 KB 2022-08-17T12:39:51.0087509Z ------------------------------------------------------------ 2022-08-17T12:39:51.0087849Z Total: 11.9 MB 2022-08-17T12:39:51.0088007Z 2022-08-17T12:39:51.0088169Z The following NEW packages will be INSTALLED: 2022-08-17T12:39:51.0088384Z 2022-08-17T12:39:51.0089026Z c-ares pkgs/main/linux-64::c-ares-1.18.1-h7f8727e_0 2022-08-17T12:39:51.0089762Z cmake pkgs/main/linux-64::cmake-3.22.1-h1fce559_0 2022-08-17T12:39:51.0090781Z expat pkgs/main/linux-64::expat-2.4.4-h295c915_0 2022-08-17T12:39:51.0091696Z krb5 pkgs/main/linux-64::krb5-1.19.2-hac12032_0 2022-08-17T12:39:51.0092173Z 
libcurl pkgs/main/linux-64::libcurl-7.84.0-h91b91d3_0 2022-08-17T12:39:51.0092686Z libedit pkgs/main/linux-64::libedit-3.1.20210910-h7f8727e_0 2022-08-17T12:39:51.0093172Z libev pkgs/main/linux-64::libev-4.33-h7f8727e_1 2022-08-17T12:39:51.0093651Z libnghttp2 pkgs/main/linux-64::libnghttp2-1.46.0-hce63b2e_0 2022-08-17T12:39:51.0094150Z libssh2 pkgs/main/linux-64::libssh2-1.10.0-h8f2d780_0 2022-08-17T12:39:51.0094625Z libuv pkgs/main/linux-64::libuv-1.40.0-h7b6447c_0 2022-08-17T12:39:51.0095071Z lz4-c pkgs/main/linux-64::lz4-c-1.9.3-h295c915_1 2022-08-17T12:39:51.0095530Z rhash pkgs/main/linux-64::rhash-1.4.1-h3c74f83_1 2022-08-17T12:39:51.0095991Z zstd pkgs/main/linux-64::zstd-1.5.2-ha4553b6_0 2022-08-17T12:39:51.0096199Z 2022-08-17T12:39:51.0096218Z 2022-08-17T12:39:52.0338962Z Preparing transaction: ...working... done 2022-08-17T12:39:52.5270115Z Verifying transaction: ...working... done 2022-08-17T12:39:52.9593092Z Executing transaction: ...working... done 2022-08-17T12:39:53.1474809Z ++ [[ linux-bionic-cuda11.6-py3.10-gcc7 == *centos* ]] 2022-08-17T12:39:53.1475264Z + echo 'Environment variables' 2022-08-17T12:39:53.1475546Z Environment variables 2022-08-17T12:39:53.1475770Z + env 2022-08-17T12:39:53.1484218Z SHARD_NUMBER=2 2022-08-17T12:39:53.1484800Z NV_LIBCUBLAS_DEV_VERSION=11.9.2.110-1 2022-08-17T12:39:53.1485433Z NV_CUDA_COMPAT_PACKAGE=cuda-compat-11-6 2022-08-17T12:39:53.1486073Z LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2022-08-17T12:39:53.1486850Z NV_LIBNCCL_DEV_PACKAGE=libnccl-dev=2.12.10-1+cuda11.6 2022-08-17T12:39:53.1487352Z UCC_HOME=/usr 2022-08-17T12:39:53.1488023Z BUILD_ENVIRONMENT=linux-bionic-cuda11.6-py3.10-gcc7 2022-08-17T12:39:53.1488794Z NV_LIBNPP_DEV_PACKAGE=libnpp-dev-11-6=11.6.3.124-1 2022-08-17T12:39:53.1489333Z INSTALLED_DB=yes 2022-08-17T12:39:53.1489723Z HOSTNAME=20a8245f1146 2022-08-17T12:39:53.1489992Z GITHUB_REF_NAME=82657/merge 2022-08-17T12:39:53.1490312Z GITHUB_API_URL=https://api.github.com 
2022-08-17T12:39:53.1490598Z OPENSSL_DIR=/opt/openssl 2022-08-17T12:39:53.1490925Z UCC_COMMIT=a7bda274b10f8adf5bb729f01da064f4e735fb23 2022-08-17T12:39:53.1491684Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_09b3a9c0-b000-4dc4-ab2f-cb40103f998c 2022-08-17T12:39:53.1492391Z CUDA_PATH=/usr/local/cuda 2022-08-17T12:39:53.1493187Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2022-08-17T12:39:53.1493765Z GITHUB_RUN_ATTEMPT=1 2022-08-17T12:39:53.1494179Z TEST_CONFIG=distributed 2022-08-17T12:39:53.1494730Z NV_LIBNPP_VERSION=11.6.3.124-1 2022-08-17T12:39:53.1495202Z NV_NVPROF_DEV_PACKAGE=cuda-nvprof-11-6=11.6.124-1 2022-08-17T12:39:53.1495509Z GITHUB_REPOSITORY_OWNER=pytorch 2022-08-17T12:39:53.1495782Z GITHUB_ACTIONS=true 2022-08-17T12:39:53.1496103Z NVIDIA_VISIBLE_DEVICES=all 2022-08-17T12:39:53.1496645Z NV_NVPROF_VERSION=11.6.124-1 2022-08-17T12:39:53.1497179Z NV_LIBCUSPARSE_VERSION=11.7.2.124-1 2022-08-17T12:39:53.1497532Z CI=true 2022-08-17T12:39:53.1498049Z PYTORCH_OVERRIDE_FLAKY_SIGNAL=1 2022-08-17T12:39:53.1498472Z NV_LIBCUBLAS_DEV_PACKAGE=libcublas-dev-11-6=11.9.2.110-1 2022-08-17T12:39:53.1498782Z BRANCH=pull/82657 2022-08-17T12:39:53.1499220Z GITHUB_HEAD_REF=fx-passes-nvfuser 2022-08-17T12:39:53.1499497Z UCX_COMMIT=v1.13.x 2022-08-17T12:39:53.1499754Z GITHUB_ACTOR=IvanYashchuk 2022-08-17T12:39:53.1500066Z CMAKE_CUDA_COMPILER_LAUNCHER=/opt/cache/bin/sccache 2022-08-17T12:39:53.1500365Z GITHUB_ACTION_REF= 2022-08-17T12:39:53.1500623Z NCCL_VERSION=2.12.10-1 2022-08-17T12:39:53.1500880Z GITHUB_ACTION=__self 2022-08-17T12:39:53.1501144Z GITHUB_REF_PROTECTED=false 2022-08-17T12:39:53.1501580Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2022-08-17T12:39:53.1504166Z *** 2022-08-17T12:39:53.1504590Z INSTALLED_VISION=yes 2022-08-17T12:39:53.1504833Z NVARCH=x86_64 2022-08-17T12:39:53.1505180Z NV_LIBCUSPARSE_DEV_VERSION=11.7.2.124-1 
2022-08-17T12:39:53.1505479Z HOME=/var/lib/jenkins 2022-08-17T12:39:53.1505769Z CARGO_NET_GIT_FETCH_WITH_CLI=true 2022-08-17T12:39:53.1506054Z GITHUB_ACTION_REPOSITORY= 2022-08-17T12:39:53.1506336Z GITHUB_REF_TYPE=branch 2022-08-17T12:39:53.1506663Z NV_LIBNCCL_PACKAGE_VERSION=2.12.10-1 2022-08-17T12:39:53.1506947Z GITHUB_RETENTION_DAYS=90 2022-08-17T12:39:53.1507354Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2022-08-17T12:39:53.1507791Z NV_LIBNCCL_PACKAGE=libnccl2=2.12.10-1+cuda11.6 2022-08-17T12:39:53.1508360Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_09b3a9c0-b000-4dc4-ab2f-cb40103f998c 2022-08-17T12:39:53.1508796Z DEBIAN_FRONTEND=noninteractive 2022-08-17T12:39:53.1509174Z NV_LIBNCCL_DEV_PACKAGE_NAME=libnccl-dev 2022-08-17T12:39:53.1509474Z GITHUB_REF=refs/pull/82657/merge 2022-08-17T12:39:53.1509796Z NV_CUDA_LIB_VERSION=11.6.2-1 2022-08-17T12:39:53.1510128Z GITHUB_SHA=e9efe573711d9f8141e152403ee143f5875669e7 2022-08-17T12:39:53.1510421Z INSTALLED_PROTOBUF=yes 2022-08-17T12:39:53.1510696Z GITHUB_RUN_ID=2875102080 2022-08-17T12:39:53.1511067Z NV_LIBNPP_PACKAGE=libnpp-11-6=11.6.3.124-1 2022-08-17T12:39:53.1511379Z NV_LIBNCCL_PACKAGE_NAME=libnccl2 2022-08-17T12:39:53.1511697Z LIBRARY_PATH=/usr/local/cuda/lib64/stubs 2022-08-17T12:39:53.1512030Z NV_NVTX_VERSION=11.6.124-1 2022-08-17T12:39:53.1512333Z GITHUB_SERVER_URL=https://github.com 2022-08-17T12:39:53.1512624Z MAX_JOBS=30 2022-08-17T12:39:53.1512921Z NV_LIBCUBLAS_VERSION=11.9.2.110-1 2022-08-17T12:39:53.1513321Z NV_LIBCUBLAS_PACKAGE=libcublas-11-6=11.9.2.110-1 2022-08-17T12:39:53.1513821Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2022-08-17T12:39:53.1514189Z UCX_HOME=/usr 2022-08-17T12:39:53.1514455Z PYTORCH_RETRY_TEST_CASES=1 2022-08-17T12:39:53.1514787Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2022-08-17T12:39:53.1515161Z BASE_SHA=343b5f86512f75f8f3bd4b90749c0459743b9e72 2022-08-17T12:39:53.1515515Z 
NV_CUDA_CUDART_DEV_VERSION=11.6.55-1 2022-08-17T12:39:53.1516301Z PR_BODY=### DescriptionThis PR replaces `DecompositionInterpreter` with a context manager which decomposes a function into prims only if the decomposition is executable by nvFuser.Partitioning of the graph is removed because its handled by the `execute(..., executor=nvfuser)` function.### TestingExisting tests in `test/test_fx_backends.py`. 2022-08-17T12:39:53.1517118Z GITHUB_BASE_REF=master 2022-08-17T12:39:53.1517375Z TERM=xterm 2022-08-17T12:39:53.1517610Z XLA_CUDA= 2022-08-17T12:39:53.1517883Z NV_NVML_DEV_VERSION=11.6.55-1 2022-08-17T12:39:53.1518173Z TORCH_CUDA_ARCH_LIST=Maxwell 2022-08-17T12:39:53.1518447Z CUDA_VERSION=11.6.2 2022-08-17T12:39:53.1518791Z NV_LIBCUBLAS_PACKAGE_NAME=libcublas-11-6 2022-08-17T12:39:53.1519112Z OPENSSL_ROOT_DIR=/opt/openssl 2022-08-17T12:39:53.1519687Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_09b3a9c0-b000-4dc4-ab2f-cb40103f998c 2022-08-17T12:39:53.1520109Z GITHUB_JOB=test 2022-08-17T12:39:53.1522105Z COMMIT_MESSAGES=+ 52429d4ec9d162496e677edd78a883288d617a90 Add a helper for make_fx+ 80daf93686965f0383b5fa106ed8782b6905d891 Map torch.ops.aten calls to refs using decomposition_table+ e03d99a2da7e26168d8878f4e77d86f9c1af572a Use TorchRefsNvfuserCapabilityMode for fx.passes.backends.nvfuser+ a2242f67da754bd2714e056764d0cea51f8cdaed Fix output of PartitionedInterpreter to use _out_spec+ fcca9240398d1cf986a5606a61c50f7903d4ee96 Remove incorrect asserts from tests+ 209563d78b9998025726c92d4fa419151454f79e Add a link to the discussion about torch.ops.aten->refs path+ d3f1f74e60a1d82de78a545b7a7a177e3de6c8a2 Revert changes to torch/fx/passes/backends/nvfuser.py+ ce6a3c605df99d1df57c0dda75c06d748e54ed2a Add aten->refs->prims under TorchRefsMode test 2022-08-17T12:39:53.1523394Z NVIDIA_DRIVER_CAPABILITIES=compute,utility 2022-08-17T12:39:53.1523681Z NUM_TEST_SHARDS=2 2022-08-17T12:39:53.1523934Z PR_NUMBER=82657 
2022-08-17T12:39:53.1524235Z SHLVL=1 2022-08-17T12:39:53.1524589Z NV_LIBCUBLAS_DEV_PACKAGE_NAME=libcublas-dev-11-6 2022-08-17T12:39:53.1524940Z GITHUB_REPOSITORY=pytorch/pytorch 2022-08-17T12:39:53.1525596Z NVIDIA_REQUIRE_CUDA=cuda>=11.6 brand=tesla,driver>=418,driver<419 brand=tesla,driver>=450,driver<451 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 2022-08-17T12:39:53.1526239Z NV_LIBNPP_DEV_VERSION=11.6.3.124-1 2022-08-17T12:39:53.1526552Z SHA1=ce6a3c605df99d1df57c0dda75c06d748e54ed2a 2022-08-17T12:39:53.1526873Z GITHUB_EVENT_NAME=pull_request 2022-08-17T12:39:53.1527205Z NV_CUDA_CUDART_VERSION=11.6.55-1 2022-08-17T12:39:53.1527565Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2022-08-17T12:39:53.1527868Z GITHUB_RUN_NUMBER=40757 2022-08-17T12:39:53.1528139Z GITHUB_WORKFLOW=pull 2022-08-17T12:39:53.1528566Z PATH=/opt/cache/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2022-08-17T12:39:53.1529058Z NV_LIBNCCL_DEV_PACKAGE_VERSION=2.12.10-1 2022-08-17T12:39:53.1529528Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-08-17T12:39:53.1529892Z GITHUB_TRIGGERING_ACTOR=IvanYashchuk 2022-08-17T12:39:53.1530175Z _=/usr/bin/env 2022-08-17T12:39:53.1530481Z + echo 'Testing pytorch' 2022-08-17T12:39:53.1530731Z Testing pytorch 2022-08-17T12:39:53.1531020Z + export LANG=C.UTF-8 2022-08-17T12:39:53.1531298Z + LANG=C.UTF-8 2022-08-17T12:39:53.1531531Z + PR_NUMBER=82657 2022-08-17T12:39:53.1531812Z + [[ distributed == \d\e\f\a\u\l\t ]] 2022-08-17T12:39:53.1532127Z + [[ distributed == \d\i\s\t\r\i\b\u\t\e\d ]] 2022-08-17T12:39:53.1532557Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *rocm* ]] 2022-08-17T12:39:53.1532871Z + [[ distributed == \s\l\o\w ]] 2022-08-17T12:39:53.1533313Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *slow-gradcheck* ]] 
2022-08-17T12:39:53.1533778Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *cuda* ]] 2022-08-17T12:39:53.1534131Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2022-08-17T12:39:53.1534470Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2022-08-17T12:39:53.1534903Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *cuda11* ]] 2022-08-17T12:39:53.1535227Z + export BUILD_SPLIT_CUDA=ON 2022-08-17T12:39:53.1535506Z + BUILD_SPLIT_CUDA=ON 2022-08-17T12:39:53.1535789Z + [[ distributed == *crossref* ]] 2022-08-17T12:39:53.1536062Z + [[ distributed == *dynamo* ]] 2022-08-17T12:39:53.1536369Z + [[ -n 82657 ]] 2022-08-17T12:39:53.1536643Z + [[ -z '' ]] 2022-08-17T12:39:53.1536932Z + export PYTORCH_TEST_SKIP_CUDA_MEM_LEAK_CHECK=1 2022-08-17T12:39:53.1537277Z + PYTORCH_TEST_SKIP_CUDA_MEM_LEAK_CHECK=1 2022-08-17T12:39:53.1537705Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *rocm* ]] 2022-08-17T12:39:53.1538143Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 != *-bazel-* ]] 2022-08-17T12:39:53.1538525Z + pip_install --user ninja 2022-08-17T12:39:53.1538911Z + pip install --progress-bar off --user ninja 2022-08-17T12:39:53.7399093Z Collecting ninja 2022-08-17T12:39:53.7602387Z Downloading ninja-1.10.2.3-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2022-08-17T12:39:54.2504616Z Installing collected packages: ninja 2022-08-17T12:39:54.2611715Z  WARNING: The script ninja is installed in '/var/lib/jenkins/.local/bin' which is not on PATH. 2022-08-17T12:39:54.2612413Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 
2022-08-17T12:39:54.2686312Z Successfully installed ninja-1.10.2.3 2022-08-17T12:39:54.3336616Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2022-08-17T12:39:54.3337932Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2022-08-17T12:39:54.3339115Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *asan* ]] 2022-08-17T12:39:54.3340047Z + [[ distributed == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2022-08-17T12:39:54.3340658Z + [[ distributed == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2022-08-17T12:39:54.3347151Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *tbb* ]] 2022-08-17T12:39:54.3362831Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *libtorch* ]] 2022-08-17T12:39:54.3363292Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *-bazel-* ]] 2022-08-17T12:39:54.3365899Z + cd test 2022-08-17T12:39:54.3366417Z + python -c 'import torch; print(torch.__config__.show())' 2022-08-17T12:39:55.7886691Z PyTorch built with: 2022-08-17T12:39:55.7887132Z - GCC 7.5 2022-08-17T12:39:55.7887450Z - C++ Version: 201402 2022-08-17T12:39:55.7887997Z - Intel(R) oneAPI Math Kernel Library Version 2022.0-Product Build 20211112 for Intel(R) 64 architecture applications 2022-08-17T12:39:55.7888572Z - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815) 2022-08-17T12:39:55.7888971Z - OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2022-08-17T12:39:55.7889368Z - LAPACK is enabled (usually provided by MKL) 2022-08-17T12:39:55.7889699Z - NNPACK is enabled 2022-08-17T12:39:55.7889999Z - CPU capability usage: AVX2 2022-08-17T12:39:55.7890308Z - CUDA Runtime 11.6 2022-08-17T12:39:55.7890717Z - NVCC architecture flags: -gencode;arch=compute_52,code=sm_52 2022-08-17T12:39:55.7891102Z - CuDNN 8.3.2 (built against CUDA 11.5) 2022-08-17T12:39:55.7891409Z - Magma 2.6.1 2022-08-17T12:39:55.7894298Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.6, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Werror -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 2022-08-17T12:39:55.7896461Z 2022-08-17T12:39:55.9960815Z + cd test 2022-08-17T12:39:55.9961416Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2022-08-17T12:39:57.3486650Z ATen/Parallel: 2022-08-17T12:39:57.3487003Z at::get_num_threads() : 16 2022-08-17T12:39:57.3487301Z 
at::get_num_interop_threads() : 16 2022-08-17T12:39:57.3487626Z OpenMP 201511 (a.k.a. OpenMP 4.5) 2022-08-17T12:39:57.3487890Z omp_get_max_threads() : 16 2022-08-17T12:39:57.3488806Z Intel(R) oneAPI Math Kernel Library Version 2022.0-Product Build 20211112 for Intel(R) 64 architecture applications 2022-08-17T12:39:57.3489222Z mkl_get_max_threads() : 16 2022-08-17T12:39:57.3489680Z Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815) 2022-08-17T12:39:57.3490040Z std::thread::hardware_concurrency() : 32 2022-08-17T12:39:57.3490336Z Environment variables: 2022-08-17T12:39:57.3490612Z OMP_NUM_THREADS : [not set] 2022-08-17T12:39:57.3490876Z MKL_NUM_THREADS : [not set] 2022-08-17T12:39:57.3491157Z ATen parallel backend: OpenMP 2022-08-17T12:39:57.3491338Z 2022-08-17T12:39:57.5455184Z + [[ distributed == *deploy* ]] 2022-08-17T12:39:57.5455530Z + [[ distributed == *backward* ]] 2022-08-17T12:39:57.5455806Z + [[ distributed == *xla* ]] 2022-08-17T12:39:57.5456102Z + [[ distributed == \j\i\t\_\l\e\g\a\c\y ]] 2022-08-17T12:39:57.5456936Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *libtorch* ]] 2022-08-17T12:39:57.5457262Z + [[ distributed == distributed ]] 2022-08-17T12:39:57.5457544Z + install_torchdynamo 2022-08-17T12:39:57.5457792Z + local commit 2022-08-17T12:39:57.5460858Z ++ get_pinned_commit torchdynamo 2022-08-17T12:39:57.5461188Z ++ cat .github/ci_commit_pins/torchdynamo.txt 2022-08-17T12:39:57.5477403Z + commit=f19410cd8204fa1c30ca72f81142508e128be66f 2022-08-17T12:39:57.5478034Z + pip_install --user git+https://github.com/pytorch/torchdynamo.git@f19410cd8204fa1c30ca72f81142508e128be66f 2022-08-17T12:39:57.5478728Z + pip install --progress-bar off --user git+https://github.com/pytorch/torchdynamo.git@f19410cd8204fa1c30ca72f81142508e128be66f 2022-08-17T12:39:58.0461658Z Collecting git+https://github.com/pytorch/torchdynamo.git@f19410cd8204fa1c30ca72f81142508e128be66f 2022-08-17T12:39:58.0467502Z Cloning 
https://github.com/pytorch/torchdynamo.git (to revision f19410cd8204fa1c30ca72f81142508e128be66f) to /tmp/pip-req-build-qw86a442 2022-08-17T12:39:58.0487536Z Running command git clone --filter=blob:none --quiet https://github.com/pytorch/torchdynamo.git /tmp/pip-req-build-qw86a442 2022-08-17T12:39:58.6794493Z Running command git rev-parse -q --verify 'sha^f19410cd8204fa1c30ca72f81142508e128be66f' 2022-08-17T12:39:58.6815112Z Running command git fetch -q https://github.com/pytorch/torchdynamo.git f19410cd8204fa1c30ca72f81142508e128be66f 2022-08-17T12:39:58.9193255Z Running command git checkout -q f19410cd8204fa1c30ca72f81142508e128be66f 2022-08-17T12:39:59.2594229Z Resolved https://github.com/pytorch/torchdynamo.git to commit f19410cd8204fa1c30ca72f81142508e128be66f 2022-08-17T12:40:02.7830323Z Preparing metadata (setup.py) ... done 2022-08-17T12:40:02.7898832Z Requirement already satisfied: torch>=1.12.0 in /opt/conda/lib/python3.10/site-packages (from torchdynamo==1.13.0.dev0) (1.13.0a0+gitce6a3c6) 2022-08-17T12:40:02.7903049Z Requirement already satisfied: numpy in /opt/conda/lib/python3.10/site-packages (from torchdynamo==1.13.0.dev0) (1.21.2) 2022-08-17T12:40:02.8357464Z Collecting tabulate 2022-08-17T12:40:02.8568812Z Downloading tabulate-0.8.10-py3-none-any.whl (29 kB) 2022-08-17T12:40:02.8627450Z Requirement already satisfied: pyyaml in /opt/conda/lib/python3.10/site-packages/PyYAML-6.0-py3.10-linux-x86_64.egg (from torchdynamo==1.13.0.dev0) (6.0) 2022-08-17T12:40:02.8631954Z Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torchdynamo==1.13.0.dev0) (1.10.1) 2022-08-17T12:40:02.8657231Z Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.10/site-packages (from torch>=1.12.0->torchdynamo==1.13.0.dev0) (4.3.0) 2022-08-17T12:40:02.8690202Z Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.10/site-packages (from sympy->torchdynamo==1.13.0.dev0) (1.2.1)
2022-08-17T12:40:02.8815122Z Building wheels for collected packages: torchdynamo 2022-08-17T12:40:08.8965639Z Building wheel for torchdynamo (setup.py) ... done 2022-08-17T12:40:08.9060698Z Created wheel for torchdynamo: filename=torchdynamo-1.13.0.dev0-cp310-cp310-linux_x86_64.whl size=2565189 sha256=1af404c94523e1c1aa228062ddf25eeff3e5bf3ecf145a31045d2ecf78a975a1 2022-08-17T12:40:08.9062101Z Stored in directory: /var/lib/jenkins/.cache/pip/wheels/45/bf/dc/7c08f8612b59061a575859a873a5c29fae8bbc62da1b5efff0 2022-08-17T12:40:08.9084290Z Successfully built torchdynamo 2022-08-17T12:40:09.3730176Z Installing collected packages: tabulate, torchdynamo 2022-08-17T12:40:09.6997238Z Successfully installed tabulate-0.8.10 torchdynamo-1.13.0.dev0 2022-08-17T12:40:09.7786725Z + test_distributed 2022-08-17T12:40:09.7787182Z + echo 'Testing distributed python tests' 2022-08-17T12:40:09.7787505Z Testing distributed python tests 2022-08-17T12:40:09.7787948Z + python test/run_test.py --distributed-tests --shard 2 2 --verbose 2022-08-17T12:40:11.7280884Z Ignoring disabled issues: [] 2022-08-17T12:40:11.7447507Z /var/lib/jenkins/workspace/test/run_test.py:839: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
2022-08-17T12:40:11.7448082Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-08-17T12:40:11.7450414Z Found test time stats from artifacts 2022-08-17T12:40:11.7451920Z Selected tests: 2022-08-17T12:40:11.7452473Z distributed/rpc/cuda/test_tensorpipe_agent 2022-08-17T12:40:11.7453030Z distributed/test_c10d_nccl 2022-08-17T12:40:11.7453482Z distributed/test_c10d_gloo 2022-08-17T12:40:11.7453966Z distributed/fsdp/test_fsdp_core 2022-08-17T12:40:11.7454358Z distributed/fsdp/test_fsdp_mixed_precision 2022-08-17T12:40:11.7454701Z distributed/fsdp/test_fsdp_summon_full_params 2022-08-17T12:40:11.7455006Z distributed/fsdp/test_fsdp_state_dict 2022-08-17T12:40:11.7455339Z distributed/optim/test_zero_redundancy_optimizer 2022-08-17T12:40:11.7455673Z distributed/fsdp/test_fsdp_optim_state 2022-08-17T12:40:11.7455996Z distributed/_shard/sharded_tensor/test_sharded_tensor 2022-08-17T12:40:11.7456317Z distributed/test_c10d_pypg 2022-08-17T12:40:11.7456592Z distributed/fsdp/test_wrap 2022-08-17T12:40:11.7456876Z distributed/fsdp/test_fsdp_clip_grad_norm 2022-08-17T12:40:11.7457227Z distributed/algorithms/quantization/test_quantization 2022-08-17T12:40:11.7457551Z distributed/test_pg_wrapper 2022-08-17T12:40:11.7457825Z distributed/fsdp/test_fsdp_misc 2022-08-17T12:40:11.7458124Z distributed/fsdp/test_fsdp_comm_hooks 2022-08-17T12:40:11.7458425Z distributed/test_c10d_spawn_nccl 2022-08-17T12:40:11.7458725Z distributed/fsdp/test_fsdp_freezing_weights 2022-08-17T12:40:11.7459036Z distributed/fsdp/test_fsdp_comm 2022-08-17T12:40:11.7459335Z distributed/fsdp/test_fsdp_exec_order 2022-08-17T12:40:11.7459654Z distributed/algorithms/ddp_comm_hooks/test_ddp_hooks 2022-08-17T12:40:11.7460007Z distributed/fsdp/test_fsdp_meta 2022-08-17T12:40:11.7460326Z distributed/fsdp/test_fsdp_ignored_modules 2022-08-17T12:40:11.7460686Z distributed/_shard/checkpoint/test_file_system_checkpoint 2022-08-17T12:40:11.7461031Z 
distributed/_shard/checkpoint/test_checkpoint 2022-08-17T12:40:11.7461356Z distributed/test_c10d_object_collectives 2022-08-17T12:40:11.7461713Z distributed/_shard/checkpoint/test_file_system_checkpoint_cpu 2022-08-17T12:40:11.7462073Z distributed/_shard/sharding_plan/test_sharding_plan 2022-08-17T12:40:11.7462405Z distributed/_shard/test_partial_tensor 2022-08-17T12:40:11.7462746Z distributed/_shard/sharded_tensor/ops/test_binary_cmp 2022-08-17T12:40:11.7463093Z distributed/fsdp/test_distributed_checkpoint 2022-08-17T12:40:11.7463630Z distributed/_shard/sharded_tensor/ops/test_init 2022-08-17T12:40:11.7463983Z distributed/_shard/sharded_tensor/ops/test_embedding 2022-08-17T12:40:11.7464350Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 2022-08-17T12:40:11.7464690Z distributed/fsdp/test_fsdp_multiple_forward 2022-08-17T12:40:11.7465014Z distributed/fsdp/test_fsdp_pure_fp16 2022-08-17T12:40:11.7465333Z distributed/elastic/timer/local_timer_test 2022-08-17T12:40:11.7465632Z distributed/fsdp/test_fsdp_uneven 2022-08-17T12:40:11.7465923Z distributed/test_data_parallel 2022-08-17T12:40:11.7466231Z distributed/elastic/utils/distributed_test 2022-08-17T12:40:11.7466734Z distributed/_shard/sharded_tensor/test_megatron_prototype 2022-08-17T12:40:11.7467090Z distributed/elastic/utils/util_test 2022-08-17T12:40:11.7467407Z distributed/fsdp/test_checkpoint_wrapper 2022-08-17T12:40:11.7467712Z distributed/_shard/checkpoint/test_utils 2022-08-17T12:40:11.7468027Z distributed/elastic/utils/logging_test 2022-08-17T12:40:11.7468317Z distributed/test_launcher 2022-08-17T12:40:11.7468616Z distributed/_shard/test_replicated_tensor 2022-08-17T12:40:11.7468909Z distributed/elastic/events/lib_test 2022-08-17T12:40:11.7469210Z distributed/fsdp/test_shard_utils 2022-08-17T12:40:11.7469523Z distributed/pipeline/sync/skip/test_gpipe 2022-08-17T12:40:11.7469830Z distributed/pipeline/sync/skip/test_leak 2022-08-17T12:40:11.7470269Z distributed/pipeline/sync/skip/test_stash_pop 
2022-08-17T12:40:11.7470624Z distributed/pipeline/sync/skip/test_verify_skippables 2022-08-17T12:40:11.7470940Z distributed/pipeline/sync/test_bugs 2022-08-17T12:40:11.7471251Z distributed/pipeline/sync/test_copy 2022-08-17T12:40:11.7471569Z distributed/pipeline/sync/test_dependency 2022-08-17T12:40:11.7471880Z distributed/pipeline/sync/test_microbatch 2022-08-17T12:40:11.7472191Z distributed/pipeline/sync/test_pipe 2022-08-17T12:40:11.7472502Z distributed/pipeline/sync/test_stream 2022-08-17T12:40:11.7472794Z distributed/pipeline/sync/test_worker 2022-08-17T12:40:11.7473108Z distributed/rpc/test_tensorpipe_agent 2022-08-17T12:40:11.7488973Z Prioritized test from test file changes. 2022-08-17T12:40:11.7489463Z reordering tests for PR: 2022-08-17T12:40:11.7489981Z prioritized: ['distributed/_shard/sharded_tensor/ops/test_embedding'] 2022-08-17T12:40:11.7495623Z the rest: ['distributed/rpc/cuda/test_tensorpipe_agent', 'distributed/test_c10d_nccl', 'distributed/test_c10d_gloo', 'distributed/fsdp/test_fsdp_core', 'distributed/fsdp/test_fsdp_mixed_precision', 'distributed/fsdp/test_fsdp_summon_full_params', 'distributed/fsdp/test_fsdp_state_dict', 'distributed/optim/test_zero_redundancy_optimizer', 'distributed/fsdp/test_fsdp_optim_state', 'distributed/_shard/sharded_tensor/test_sharded_tensor', 'distributed/test_c10d_pypg', 'distributed/fsdp/test_wrap', 'distributed/fsdp/test_fsdp_clip_grad_norm', 'distributed/algorithms/quantization/test_quantization', 'distributed/test_pg_wrapper', 'distributed/fsdp/test_fsdp_misc', 'distributed/fsdp/test_fsdp_comm_hooks', 'distributed/test_c10d_spawn_nccl', 'distributed/fsdp/test_fsdp_freezing_weights', 'distributed/fsdp/test_fsdp_comm', 'distributed/fsdp/test_fsdp_exec_order', 'distributed/algorithms/ddp_comm_hooks/test_ddp_hooks', 'distributed/fsdp/test_fsdp_meta', 'distributed/fsdp/test_fsdp_ignored_modules', 'distributed/_shard/checkpoint/test_file_system_checkpoint', 'distributed/_shard/checkpoint/test_checkpoint', 
'distributed/test_c10d_object_collectives', 'distributed/_shard/checkpoint/test_file_system_checkpoint_cpu', 'distributed/_shard/sharding_plan/test_sharding_plan', 'distributed/_shard/test_partial_tensor', 'distributed/_shard/sharded_tensor/ops/test_binary_cmp', 'distributed/fsdp/test_distributed_checkpoint', 'distributed/_shard/sharded_tensor/ops/test_init', 'distributed/_shard/sharded_tensor/test_sharded_tensor_reshard', 'distributed/fsdp/test_fsdp_multiple_forward', 'distributed/fsdp/test_fsdp_pure_fp16', 'distributed/elastic/timer/local_timer_test', 'distributed/fsdp/test_fsdp_uneven', 'distributed/test_data_parallel', 'distributed/elastic/utils/distributed_test', 'distributed/_shard/sharded_tensor/test_megatron_prototype', 'distributed/elastic/utils/util_test', 'distributed/fsdp/test_checkpoint_wrapper', 'distributed/_shard/checkpoint/test_utils', 'distributed/elastic/utils/logging_test', 'distributed/test_launcher', 'distributed/_shard/test_replicated_tensor', 'distributed/elastic/events/lib_test', 'distributed/fsdp/test_shard_utils', 'distributed/pipeline/sync/skip/test_gpipe', 'distributed/pipeline/sync/skip/test_leak', 'distributed/pipeline/sync/skip/test_stash_pop', 'distributed/pipeline/sync/skip/test_verify_skippables', 'distributed/pipeline/sync/test_bugs', 'distributed/pipeline/sync/test_copy', 'distributed/pipeline/sync/test_dependency', 'distributed/pipeline/sync/test_microbatch', 'distributed/pipeline/sync/test_pipe', 'distributed/pipeline/sync/test_stream', 'distributed/pipeline/sync/test_worker', 'distributed/rpc/test_tensorpipe_agent'] 2022-08-17T12:40:11.7499203Z 2022-08-17T12:40:11.7499757Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-08-17T12:40:11.7902351Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests.json to 
/var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-08-17T12:40:11.8291845Z Running distributed/_shard/sharded_tensor/ops/test_embedding ... [2022-08-17 12:40:11.828790] 2022-08-17T12:40:11.8292928Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_embedding.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 12:40:11.828848] 2022-08-17T12:40:13.3899664Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding 2022-08-17T12:40:13.3916193Z 2022-08-17T12:40:13.3916628Z Running tests... 2022-08-17T12:40:13.3917293Z ---------------------------------------------------------------------- 2022-08-17T12:40:14.9165047Z test_sharded_embedding_colwise (__main__.TestShardedEmbedding) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:40:14.9362393Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 651 2022-08-17T12:40:14.9368416Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 652 2022-08-17T12:40:14.9374532Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 653 2022-08-17T12:40:14.9381250Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 654 2022-08-17T12:40:16.3629325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:16.3630033Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:16.3631336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:16.3632022Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:16.3681936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:16.3682670Z warnings.warn(f"loaded {len(slow_tests_dict)} slow 
tests") 2022-08-17T12:40:16.3685118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:16.3685864Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:16.3813613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:16.3814354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:16.3816668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:16.3817437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:16.4080377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:16.4081076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:16.4083283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:16.4084112Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:16.5314541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:40:16.5382443Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:40:16.5470140Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:40:16.5773800Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:40:16.9442852Z skip: Need at least 4 CUDA devices (3.552s) 2022-08-17T12:40:16.9469054Z test_sharded_embedding_rowwise (__main__.TestShardedEmbedding) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 787 2022-08-17T12:40:16.9475077Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 788 2022-08-17T12:40:16.9481221Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 789 2022-08-17T12:40:16.9487584Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 790 2022-08-17T12:40:18.3541382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:18.3541998Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:18.3543484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:18.3544575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:18.3589300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:18.3590031Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:18.3592863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:18.3593591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:18.4311459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:18.4312420Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:18.4313415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:18.4313870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:18.4452265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow 
tests 2022-08-17T12:40:18.4453037Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:18.4455596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:18.4456323Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:18.5238036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:40:18.5276359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:40:18.6041562Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:40:18.6192284Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:40:18.9546501Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T12:40:18.9547014Z 2022-08-17T12:40:18.9547664Z ---------------------------------------------------------------------- 2022-08-17T12:40:18.9548020Z Ran 2 tests in 5.563s 2022-08-17T12:40:18.9548190Z 2022-08-17T12:40:18.9548302Z OK (skipped=2) 2022-08-17T12:40:18.9548460Z 2022-08-17T12:40:18.9548589Z Generating XML reports... 2022-08-17T12:40:18.9586747Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding/TEST-TestShardedEmbedding-20220817124013.xml 2022-08-17T12:40:19.2926802Z Running distributed/rpc/cuda/test_tensorpipe_agent ... [2022-08-17 12:40:19.292212] 2022-08-17T12:40:19.2928000Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/rpc/cuda/test_tensorpipe_agent.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 12:40:19.292286] 2022-08-17T12:40:20.8671883Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd707902_ 2022-08-17T12:40:20.8672918Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd707902_/_remote_module_non_scriptable.py 2022-08-17T12:40:21.2854129Z ]> 2022-08-17T12:40:21.2854938Z test_ddp_dist_autograd_local_vs_remote_gpu (__main__.TensorPipeCudaDdpComparisonTest) 2022-08-17T12:40:21.2855800Z , <__main__.TensorPipeCudaDistAutogradTest testMethod=test_gpu_to_cpu_continuation>, <__main__.TensorPipeCudaDistAutogradTest testMethod=test_gpu_to_cpu_continuation_gpu_root>]> 2022-08-17T12:40:21.2857255Z test_gpu_simple (__main__.TensorPipeCudaDistAutogradTest) 2022-08-17T12:40:21.2857701Z test_gpu_to_cpu_continuation (__main__.TensorPipeCudaDistAutogradTest) 2022-08-17T12:40:21.2858175Z test_gpu_to_cpu_continuation_gpu_root (__main__.TensorPipeCudaDistAutogradTest) 2022-08-17T12:40:21.2859407Z , <__main__.TensorPipeCudaRemoteModuleTest testMethod=test_input_moved_to_cuda_device_script>, <__main__.TensorPipeCudaRemoteModuleTest testMethod=test_invalid_devices>, <__main__.TensorPipeCudaRemoteModuleTest testMethod=test_valid_device>]> 2022-08-17T12:40:21.2860362Z test_input_moved_to_cuda_device (__main__.TensorPipeCudaRemoteModuleTest) 2022-08-17T12:40:21.2861039Z test_input_moved_to_cuda_device_script (__main__.TensorPipeCudaRemoteModuleTest) 2022-08-17T12:40:21.2861509Z test_invalid_devices (__main__.TensorPipeCudaRemoteModuleTest) 2022-08-17T12:40:21.2861908Z test_valid_device (__main__.TensorPipeCudaRemoteModuleTest) 2022-08-17T12:40:21.2862669Z ]> 2022-08-17T12:40:21.2863146Z test_profiler_remote_cuda (__main__.TensorPipeCudaRpcTest) 2022-08-17T12:40:21.2864663Z , <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_gloo_ckpt_except_last>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_gloo_ckpt_never>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_gloo_ckpt_never_find_unused>, 
<__main__.TensorPipePipeWithDDPTest testMethod=test_basic_nccl_ckpt_always>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_nccl_ckpt_except_last>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_nccl_ckpt_never>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_nccl_ckpt_never_find_unused>]> 2022-08-17T12:40:21.2865927Z test_basic_gloo_ckpt_always (__main__.TensorPipePipeWithDDPTest) 2022-08-17T12:40:21.2866363Z test_basic_gloo_ckpt_except_last (__main__.TensorPipePipeWithDDPTest) 2022-08-17T12:40:21.2866793Z test_basic_gloo_ckpt_never (__main__.TensorPipePipeWithDDPTest) 2022-08-17T12:40:21.2867204Z test_basic_gloo_ckpt_never_find_unused (__main__.TensorPipePipeWithDDPTest) 2022-08-17T12:40:21.2867632Z test_basic_nccl_ckpt_always (__main__.TensorPipePipeWithDDPTest) 2022-08-17T12:40:21.2868054Z test_basic_nccl_ckpt_except_last (__main__.TensorPipePipeWithDDPTest) 2022-08-17T12:40:21.2868453Z test_basic_nccl_ckpt_never (__main__.TensorPipePipeWithDDPTest) 2022-08-17T12:40:21.2868876Z test_basic_nccl_ckpt_never_find_unused (__main__.TensorPipePipeWithDDPTest) 2022-08-17T12:40:21.2883319Z , <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_async_execution_with_cuda_future>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_callback_changes_devices>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_cuda_sparse_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_cuda_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_custom_class_with_cuda_sparse_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_custom_class_with_cuda_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_list_with_cuda_sparse_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest 
testMethod=test_cuda_future_can_extract_list_with_cuda_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_device_as_device>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_device_as_int>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_device_as_str>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_device_not_cuda>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_modify_tensor_inplace>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_replace_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_value_on_bad_device>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_custom_stream>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_custom_stream_multi>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_custom_stream_nested>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_custom_stream_nested_multi>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_cpu>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_cpu_to_gpu_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_cpu_to_gpu_non_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_default_to_non_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_5>, 
<__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_6>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_7>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_8>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_5>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_6>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_7>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_self_8>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_non_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_non_default_to_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_to_cpu_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_to_cpu_non_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_gpu>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_in_options>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_invalid_max_local_device>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_invalid_max_remote_device>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_invalid_min_device>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_many_to_one>, 
<__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config_loop>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config_not_timeout>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config_remote>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config_remote_response>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config_response>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_missing_config_response_loop>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_multi_gpu>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_multi_gpu_self>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_one_to_many>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_remote>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_return_to_gpu>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_return_to_gpu_self>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_maps_wrong_worker_name>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_mismatch>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_devices_option_mismatch>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_devices_option_mismatch_reverse>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_owner_rref_forward_synchronization1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_owner_rref_forward_synchronization2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_owner_rref_forward_synchronization3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest 
testMethod=test_owner_rref_forward_synchronization4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization5>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_forward_synchronization1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_forward_synchronization2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_forward_synchronization3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_forward_synchronization4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_to_here_synchronization1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_to_here_synchronization2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_to_here_synchronization3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_to_here_synchronization4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_with_unpickleable_attributes>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_tensor_view_as_return_value>]> 2022-08-17T12:40:21.2897314Z test_async_execution_nested_with_cuda_future (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2897886Z test_async_execution_with_cuda_future (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2898416Z test_cuda_future_callback_changes_devices (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2898950Z test_cuda_future_can_extract_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2899478Z 
test_cuda_future_can_extract_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2900010Z test_cuda_future_can_extract_custom_class_with_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2900577Z test_cuda_future_can_extract_custom_class_with_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2901140Z test_cuda_future_can_extract_list_with_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2901779Z test_cuda_future_can_extract_list_with_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2902284Z test_cuda_future_device_as_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2902779Z test_cuda_future_device_as_int (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2903508Z test_cuda_future_device_as_str (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2903997Z test_cuda_future_device_not_cuda (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2904502Z test_cuda_future_modify_tensor_inplace (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2905009Z test_cuda_future_replace_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2905512Z test_cuda_future_value_on_bad_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2905977Z test_custom_stream (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2906450Z test_custom_stream_multi (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2906932Z test_custom_stream_nested (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2907405Z test_custom_stream_nested_multi (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2907885Z test_device_map_cpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2908373Z test_device_map_cpu_to_gpu_default 
(__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2908922Z test_device_map_cpu_to_gpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2909402Z test_device_map_gpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2909908Z test_device_map_gpu_default_to_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2910404Z test_device_map_gpu_mixed_1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2910884Z test_device_map_gpu_mixed_2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2911330Z test_device_map_gpu_mixed_3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2911809Z test_device_map_gpu_mixed_4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2912279Z test_device_map_gpu_mixed_5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2912726Z test_device_map_gpu_mixed_6 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2913200Z test_device_map_gpu_mixed_7 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2913672Z test_device_map_gpu_mixed_8 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2914152Z test_device_map_gpu_mixed_self_1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2914622Z test_device_map_gpu_mixed_self_2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2915103Z test_device_map_gpu_mixed_self_3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2915594Z test_device_map_gpu_mixed_self_4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2916058Z test_device_map_gpu_mixed_self_5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2916636Z test_device_map_gpu_mixed_self_6 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2917134Z test_device_map_gpu_mixed_self_7 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 
2022-08-17T12:40:21.2917610Z test_device_map_gpu_mixed_self_8 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2918076Z test_device_map_gpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2918587Z test_device_map_gpu_non_default_to_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2919097Z test_device_map_gpu_to_cpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2919584Z test_device_map_gpu_to_cpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2920139Z test_device_maps_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2920613Z test_device_maps_in_options (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2921119Z test_device_maps_invalid_max_local_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2921621Z test_device_maps_invalid_max_remote_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2922135Z test_device_maps_invalid_min_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2922625Z test_device_maps_many_to_one (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2923097Z test_device_maps_missing_config (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2923602Z test_device_maps_missing_config_loop (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2924119Z test_device_maps_missing_config_not_timeout (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2924639Z test_device_maps_missing_config_remote (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2925146Z test_device_maps_missing_config_remote_response (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2925678Z test_device_maps_missing_config_response (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2926210Z 
test_device_maps_missing_config_response_loop (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2926694Z test_device_maps_multi_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2927180Z test_device_maps_multi_gpu_self (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2927666Z test_device_maps_one_to_many (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2928146Z test_device_maps_remote (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2928609Z test_device_maps_return_to_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2929111Z test_device_maps_return_to_gpu_self (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2929615Z test_device_maps_wrong_worker_name (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2930086Z test_device_mismatch (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2930561Z test_devices_option_mismatch (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2931057Z test_devices_option_mismatch_reverse (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2931572Z test_owner_rref_forward_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2932075Z test_owner_rref_forward_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2932594Z test_owner_rref_forward_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2933115Z test_owner_rref_forward_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2933605Z test_rref_as_arg_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2934106Z test_rref_as_arg_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2934656Z test_rref_as_arg_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2935160Z 
test_rref_as_arg_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2935635Z test_rref_as_arg_synchronization5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2936138Z test_rref_forward_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2936646Z test_rref_forward_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2937133Z test_rref_forward_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2937638Z test_rref_forward_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2938140Z test_rref_to_here_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2938698Z test_rref_to_here_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2939182Z test_rref_to_here_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2939684Z test_rref_to_here_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2940196Z test_rref_with_unpickleable_attributes (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2940679Z test_tensor_view_as_return_value (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-08-17T12:40:21.2941583Z , <__main__.TensorPipeTensorPipeCudaDistAutogradTest testMethod=test_dist_autograd_sync_streams>, <__main__.TensorPipeTensorPipeCudaDistAutogradTest testMethod=test_gradients_synchronizations>]> 2022-08-17T12:40:21.2942486Z test_device_maps_backward_pass (__main__.TensorPipeTensorPipeCudaDistAutogradTest) 2022-08-17T12:40:21.2943012Z test_dist_autograd_sync_streams (__main__.TensorPipeTensorPipeCudaDistAutogradTest) 2022-08-17T12:40:21.2943766Z test_gradients_synchronizations (__main__.TensorPipeTensorPipeCudaDistAutogradTest) 2022-08-17T12:40:22.6531270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: 
UserWarning: loaded 39 slow tests 2022-08-17T12:40:22.6531757Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:22.6532912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:22.6533391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:22.8272481Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpudrv0_pi 2022-08-17T12:40:22.8275005Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpudrv0_pi/_remote_module_non_scriptable.py 2022-08-17T12:40:23.2542604Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:40:23.2559401Z 2022-08-17T12:40:23.2559863Z Running tests... 2022-08-17T12:40:23.2560389Z ---------------------------------------------------------------------- 2022-08-17T12:40:24.7795017Z test_ddp_dist_autograd_local_vs_remote_gpu (__main__.TensorPipeCudaDdpComparisonTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:40:24.7971765Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 991 2022-08-17T12:40:24.7978253Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 992 2022-08-17T12:40:24.7984047Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 993 2022-08-17T12:40:24.7991139Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 994 2022-08-17T12:40:26.2034643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:26.2035608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:26.2036797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:26.2038093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:26.2115427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:26.2116326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:26.2118553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:26.2119519Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:26.2564048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:26.2565250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:26.2567021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:26.2567977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:26.2584146Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:26.2585087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:26.2586917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:26.2587877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:26.3713319Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5zzrj0y2 2022-08-17T12:40:26.3714600Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5zzrj0y2/_remote_module_non_scriptable.py 2022-08-17T12:40:26.3789410Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4pzjqo4p 2022-08-17T12:40:26.3792108Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4pzjqo4p/_remote_module_non_scriptable.py 2022-08-17T12:40:26.4311637Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvigzha0k 2022-08-17T12:40:26.4312737Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvigzha0k/_remote_module_non_scriptable.py 2022-08-17T12:40:26.4351969Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpykcf0hsw 2022-08-17T12:40:26.4354838Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpykcf0hsw/_remote_module_non_scriptable.py 2022-08-17T12:40:26.7917778Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:40:26.8048047Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:40:26.8591794Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:40:26.8698229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:40:27.2070359Z skip: Need at least 4 CUDA 
devices (3.951s) 2022-08-17T12:40:27.2070637Z 2022-08-17T12:40:27.2071048Z ---------------------------------------------------------------------- 2022-08-17T12:40:27.2071389Z Ran 1 test in 3.951s 2022-08-17T12:40:27.2071555Z 2022-08-17T12:40:27.2071668Z OK (skipped=1) 2022-08-17T12:40:27.2071806Z 2022-08-17T12:40:27.2071937Z Generating XML reports... 2022-08-17T12:40:27.2108383Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDdpComparisonTest-20220817124023.xml 2022-08-17T12:40:28.9822642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:28.9823141Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:28.9824506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:28.9825231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:29.1582857Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp47hnawpp 2022-08-17T12:40:29.1586325Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp47hnawpp/_remote_module_non_scriptable.py 2022-08-17T12:40:29.5813080Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:40:29.5829833Z 2022-08-17T12:40:29.5830035Z Running tests... 2022-08-17T12:40:29.5830479Z ---------------------------------------------------------------------- 2022-08-17T12:40:31.1165853Z test_gpu_simple (__main__.TensorPipeCudaDistAutogradTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:40:31.1344143Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1162 2022-08-17T12:40:31.1351584Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1163 2022-08-17T12:40:31.1357604Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 1164 2022-08-17T12:40:31.1363595Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 1165 2022-08-17T12:40:32.5166608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:32.5167133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:32.5168121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:32.5168581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:32.5379081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:32.5379583Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:32.5382148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:32.5382624Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:32.5873296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:32.5873766Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:32.5876243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:32.5876705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:32.5882466Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:32.5882923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:32.5885755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:32.5886225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:32.6853144Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl9_7o_uc 2022-08-17T12:40:32.6855115Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl9_7o_uc/_remote_module_non_scriptable.py 2022-08-17T12:40:32.7089583Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoqolcw9t 2022-08-17T12:40:32.7092514Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoqolcw9t/_remote_module_non_scriptable.py 2022-08-17T12:40:32.7620503Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2q4aobc1 2022-08-17T12:40:32.7621761Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2q4aobc1/_remote_module_non_scriptable.py 2022-08-17T12:40:32.7638195Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpotarn2f9 2022-08-17T12:40:32.7641064Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpotarn2f9/_remote_module_non_scriptable.py 2022-08-17T12:40:33.0985393Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:40:33.1397396Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:40:33.1868495Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:40:33.1950250Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:40:33.2006136Z fi_getinfo: -61 
2022-08-17T12:40:33.2417679Z fi_getinfo: -61 2022-08-17T12:40:33.2889820Z fi_getinfo: -61 2022-08-17T12:40:33.2971480Z fi_getinfo: -61 2022-08-17T12:40:35.7491662Z ok (6.166s) 2022-08-17T12:40:35.7492039Z 2022-08-17T12:40:35.7492842Z ---------------------------------------------------------------------- 2022-08-17T12:40:35.7493197Z Ran 1 test in 6.166s 2022-08-17T12:40:35.7493367Z 2022-08-17T12:40:35.7493446Z OK 2022-08-17T12:40:35.7493585Z 2022-08-17T12:40:35.7493734Z Generating XML reports... 2022-08-17T12:40:35.7529129Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20220817124029.xml 2022-08-17T12:40:37.5352209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:37.5352717Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:37.5353621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:37.5354097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:37.7120266Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdo07ve2a 2022-08-17T12:40:37.7122221Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdo07ve2a/_remote_module_non_scriptable.py 2022-08-17T12:40:38.1408162Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:40:38.1425072Z 2022-08-17T12:40:38.1425335Z Running tests... 2022-08-17T12:40:38.1425798Z ---------------------------------------------------------------------- 2022-08-17T12:40:39.6633478Z test_gpu_to_cpu_continuation (__main__.TensorPipeCudaDistAutogradTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:40:39.6819865Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1585 2022-08-17T12:40:39.6826493Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1586 2022-08-17T12:40:39.6833257Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 1587 2022-08-17T12:40:39.6840044Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 1588 2022-08-17T12:40:41.0885019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:41.0885536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:41.0886143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:41.0886604Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:41.0949900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:41.0950357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:41.0953025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:41.0953481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:41.0970045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:41.0970503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:41.0974022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:41.0974527Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:41.1082035Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:41.1082489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:41.1085538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:41.1086017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:41.2555026Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0nip0o1j 2022-08-17T12:40:41.2555980Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0nip0o1j/_remote_module_non_scriptable.py 2022-08-17T12:40:41.2645280Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd1l1mp9u 2022-08-17T12:40:41.2647891Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd1l1mp9u/_remote_module_non_scriptable.py 2022-08-17T12:40:41.2660287Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1sjeyypa 2022-08-17T12:40:41.2663230Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1sjeyypa/_remote_module_non_scriptable.py 2022-08-17T12:40:41.2816776Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq9kpbamh 2022-08-17T12:40:41.2819557Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq9kpbamh/_remote_module_non_scriptable.py 2022-08-17T12:40:41.6861019Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:40:41.6962210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:40:41.6993269Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:40:41.7141559Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:40:41.7982775Z fi_getinfo: -61 
2022-08-17T12:40:41.7986612Z fi_getinfo: -61 2022-08-17T12:40:41.8012960Z fi_getinfo: -61 2022-08-17T12:40:41.8159560Z fi_getinfo: -61 2022-08-17T12:40:44.2965685Z ok (6.154s) 2022-08-17T12:40:44.2965898Z 2022-08-17T12:40:44.2966301Z ---------------------------------------------------------------------- 2022-08-17T12:40:44.2966623Z Ran 1 test in 6.154s 2022-08-17T12:40:44.2966795Z 2022-08-17T12:40:44.2966889Z OK 2022-08-17T12:40:44.2967025Z 2022-08-17T12:40:44.2967161Z Generating XML reports... 2022-08-17T12:40:44.3005303Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20220817124038.xml 2022-08-17T12:40:46.0260714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:46.0261282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:46.0262567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:46.0263049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:46.1920005Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptj1cj36l 2022-08-17T12:40:46.1922156Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptj1cj36l/_remote_module_non_scriptable.py 2022-08-17T12:40:46.6007153Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:40:46.6022362Z 2022-08-17T12:40:46.6022822Z Running tests... 2022-08-17T12:40:46.6023358Z ---------------------------------------------------------------------- 2022-08-17T12:40:48.0588851Z test_gpu_to_cpu_continuation_gpu_root (__main__.TensorPipeCudaDistAutogradTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:40:48.0763399Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2008 2022-08-17T12:40:48.0769303Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2009 2022-08-17T12:40:48.0775448Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 2010 2022-08-17T12:40:48.0781071Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 2011 2022-08-17T12:40:49.5213216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:49.5213924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:49.5214774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:49.5215784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:49.5570821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:49.5571433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:49.5574211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:49.5574910Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:49.5623335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:49.5624039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:49.5626798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:49.5627483Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:49.5724748Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:49.5725448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:49.5727888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:49.5728500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:49.6895287Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg4w0i7n7 2022-08-17T12:40:49.6896712Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg4w0i7n7/_remote_module_non_scriptable.py 2022-08-17T12:40:49.7281891Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp61w9aoxx 2022-08-17T12:40:49.7284193Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp61w9aoxx/_remote_module_non_scriptable.py 2022-08-17T12:40:49.7310971Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp67ocpp4g 2022-08-17T12:40:49.7313899Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp67ocpp4g/_remote_module_non_scriptable.py 2022-08-17T12:40:49.7416859Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdh9c62m4 2022-08-17T12:40:49.7418621Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdh9c62m4/_remote_module_non_scriptable.py 2022-08-17T12:40:50.1052830Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:40:50.1506789Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:40:50.1617447Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:40:50.1741322Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:40:50.2074772Z fi_getinfo: -61 
2022-08-17T12:40:50.2527284Z fi_getinfo: -61 2022-08-17T12:40:50.2637921Z fi_getinfo: -61 2022-08-17T12:40:50.2760784Z fi_getinfo: -61 2022-08-17T12:40:52.6904733Z ok (6.088s) 2022-08-17T12:40:52.6905200Z 2022-08-17T12:40:52.6905904Z ---------------------------------------------------------------------- 2022-08-17T12:40:52.6906289Z Ran 1 test in 6.088s 2022-08-17T12:40:52.6906454Z 2022-08-17T12:40:52.6906549Z OK 2022-08-17T12:40:52.6906666Z 2022-08-17T12:40:52.6906804Z Generating XML reports... 2022-08-17T12:40:52.6940642Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20220817124046.xml 2022-08-17T12:40:54.4160376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:54.4160884Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:54.4162433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:54.4163194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:54.5894462Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplhpwg3q2 2022-08-17T12:40:54.5897157Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplhpwg3q2/_remote_module_non_scriptable.py 2022-08-17T12:40:55.0157063Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:40:55.0172773Z 2022-08-17T12:40:55.0173236Z Running tests... 2022-08-17T12:40:55.0174159Z ---------------------------------------------------------------------- 2022-08-17T12:40:56.5135223Z test_input_moved_to_cuda_device (__main__.TensorPipeCudaRemoteModuleTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:40:56.5313281Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2431 2022-08-17T12:40:56.5319643Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2432 2022-08-17T12:40:57.9645636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:57.9646171Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:57.9647374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:57.9647840Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:57.9779271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:40:57.9779733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:40:57.9782398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:40:57.9782897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:40:58.1382345Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpclrrh0_k 2022-08-17T12:40:58.1384755Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpclrrh0_k/_remote_module_non_scriptable.py 2022-08-17T12:40:58.1450867Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_jd0xb8s 2022-08-17T12:40:58.1453597Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_jd0xb8s/_remote_module_non_scriptable.py 2022-08-17T12:40:58.5643936Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:40:58.5715007Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:40:58.6662706Z 
fi_getinfo: -61 2022-08-17T12:40:58.6735109Z fi_getinfo: -61 2022-08-17T12:41:00.5419559Z ok (5.524s) 2022-08-17T12:41:00.5419777Z 2022-08-17T12:41:00.5420218Z ---------------------------------------------------------------------- 2022-08-17T12:41:00.5420573Z Ran 1 test in 5.525s 2022-08-17T12:41:00.5420741Z 2022-08-17T12:41:00.5420817Z OK 2022-08-17T12:41:00.5420956Z 2022-08-17T12:41:00.5421396Z Generating XML reports... 2022-08-17T12:41:00.5457586Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220817124055.xml 2022-08-17T12:41:02.2792471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:02.2793549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:02.2794250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:02.2794725Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:02.4469549Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsmv0y7wc 2022-08-17T12:41:02.4472182Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsmv0y7wc/_remote_module_non_scriptable.py 2022-08-17T12:41:02.8536267Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:41:02.8551605Z 2022-08-17T12:41:02.8552105Z Running tests... 2022-08-17T12:41:02.8552998Z ---------------------------------------------------------------------- 2022-08-17T12:41:04.3181482Z test_input_moved_to_cuda_device_script (__main__.TensorPipeCudaRemoteModuleTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:41:04.3359609Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2621 2022-08-17T12:41:04.3365655Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2622 2022-08-17T12:41:05.7614317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:05.7614873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:05.7615750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:05.7616265Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:05.7783826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:05.7784291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:05.7787630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:05.7788109Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:05.9281691Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4n7xkbga 2022-08-17T12:41:05.9283653Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4n7xkbga/_remote_module_non_scriptable.py 2022-08-17T12:41:05.9501990Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpan85nw37 2022-08-17T12:41:05.9505609Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpan85nw37/_remote_module_non_scriptable.py 2022-08-17T12:41:06.3415193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:41:06.3731352Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:41:06.4435430Z 
fi_getinfo: -61 2022-08-17T12:41:06.4752100Z fi_getinfo: -61 2022-08-17T12:41:06.6502973Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpan85nw37/_remote_module___torch___torch_testing__internal_distributed_nn_api_remote_module_test_MyModuleInterface.py 2022-08-17T12:41:06.6504236Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4n7xkbga/_remote_module___torch___torch_testing__internal_distributed_nn_api_remote_module_test_MyModuleInterface.py 2022-08-17T12:41:06.6579687Z INFO:torch.distributed.nn.jit.instantiator:Skipped writing /tmp/tmp4n7xkbga/_remote_module___torch___torch_testing__internal_distributed_nn_api_remote_module_test_MyModuleInterface.py 2022-08-17T12:41:08.4467543Z ok (5.591s) 2022-08-17T12:41:08.4467803Z 2022-08-17T12:41:08.4468202Z ---------------------------------------------------------------------- 2022-08-17T12:41:08.4468546Z Ran 1 test in 5.591s 2022-08-17T12:41:08.4468718Z 2022-08-17T12:41:08.4468811Z OK 2022-08-17T12:41:08.4468930Z 2022-08-17T12:41:08.4469073Z Generating XML reports... 
2022-08-17T12:41:08.4505634Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220817124102.xml 2022-08-17T12:41:10.2029651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:10.2030164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:10.2031387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:10.2031913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:10.3761165Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq0qlo_lh 2022-08-17T12:41:10.3762823Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq0qlo_lh/_remote_module_non_scriptable.py 2022-08-17T12:41:10.8043196Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:41:10.8059461Z 2022-08-17T12:41:10.8059606Z Running tests... 2022-08-17T12:41:10.8060223Z ---------------------------------------------------------------------- 2022-08-17T12:41:12.3155450Z test_invalid_devices (__main__.TensorPipeCudaRemoteModuleTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:41:12.3338827Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2827 2022-08-17T12:41:12.3344983Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2828 2022-08-17T12:41:13.7397691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:13.7398216Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:13.7399240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:13.7399760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:13.7821835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:13.7822321Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:13.7825108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:13.7825751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:13.9074586Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2psir92i 2022-08-17T12:41:13.9076283Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2psir92i/_remote_module_non_scriptable.py 2022-08-17T12:41:13.9501168Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy3w8wtqj 2022-08-17T12:41:13.9503681Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy3w8wtqj/_remote_module_non_scriptable.py 2022-08-17T12:41:14.3162539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:41:14.3698893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:41:14.4183185Z 
fi_getinfo: -61 2022-08-17T12:41:14.4719091Z fi_getinfo: -61 2022-08-17T12:41:14.6786141Z On WorkerInfo(id=1, name=worker1): 2022-08-17T12:41:14.6804093Z RuntimeError('CUDA error: invalid device ordinal\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\nException raised from exchangeDevice at /var/lib/jenkins/workspace/c10/cuda/impl/CUDAGuardImpl.h:34 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fd334fff3bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: + 0x147a4 (0x7fd33e5177a4 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)\nframe #2: + 0x10a96f8 (0x7fd3362e86f8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #3: + 0x29de1b5 (0x7fd337c1d1b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #4: + 0x29de34b (0x7fd337c1d34b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #5: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x107 (0x7fd3401ebcd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x1d89cc5 (0x7fd3404e2cc5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #7: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x169 (0x7fd34022aa49 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #8: + 0x108537a (0x7fd33f7de37a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x12c6 (0x7fd33fb35206 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x1f4dde3 (0x7fd3406a6de3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fd33ff441f3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x1d8c4e1 (0x7fd3404e54e1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fd33ff441f3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #14: + 0x305b951 (0x7fd3417b4951 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #15: + 0x305befb (0x7fd3417b4efb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #16: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7fd33ff99981 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #17: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7fd33fb2d08e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #18: + 0x20fac79 (0x7fd340853c79 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #19: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7fd3400f51c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #20: + 0x34034f (0x7fd34c3a634f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #21: + 0x3407fc (0x7fd34c3a67fc in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #22: + 0x1ddc68 (0x563a6db52c68 in /opt/conda/bin/python)\nframe #23: + 0x1049f3 (0x563a6da799f3 in /opt/conda/bin/python)\nframe #24: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #25: + 0x104425 (0x563a6da79425 in /opt/conda/bin/python)\nframe #26: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #27: + 0x18fc9b (0x563a6db04c9b in /opt/conda/bin/python)\nframe #28: + 0x1052a5 (0x563a6da7a2a5 in /opt/conda/bin/python)\nframe #29: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #30: + 0x18fc9b (0x563a6db04c9b in /opt/conda/bin/python)\nframe #31: + 0x1052a5 (0x563a6da7a2a5 in /opt/conda/bin/python)\nframe #32: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #33: _PyObject_Call + 0x20a (0x563a6dabcfaa in /opt/conda/bin/python)\nframe #34: _PyEval_EvalFrameDefault + 0x26e4 (0x563a6db58774 in /opt/conda/bin/python)\nframe #35: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #36: _PyObject_Call + 0x20a (0x563a6dabcfaa in /opt/conda/bin/python)\nframe #37: + 0xa0ab2a (0x7fd34ca70b2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #38: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fd34ca6ed6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #39: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fd34ca71f05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #40: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7fd34ca72573 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #41: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, 
torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7fd342c99ea4 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #42: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fd34ca71be5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #43: + 0x453a313 (0x7fd342c93313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #44: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fd342c93f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #45: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fd342c8e597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #46: + 0x456a202 (0x7fd342cc3202 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #47: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fd334fed7eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #48: + 0xdbbf4 (0x7fd364143bf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #49: + 0x76db (0x7fd3847976db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #50: clone + 0x3f (0x7fd3844c061f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-08-17T12:41:14.6814648Z Traceback (most recent call last): 2022-08-17T12:41:14.6815208Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function 2022-08-17T12:41:14.6815780Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-08-17T12:41:14.6816378Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py", line 89, in _create_module 2022-08-17T12:41:14.6816774Z module.to(device) 
2022-08-17T12:41:14.6817239Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 982, in to 2022-08-17T12:41:14.6817675Z return self._apply(convert) 2022-08-17T12:41:14.6818169Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 658, in _apply 2022-08-17T12:41:14.6818557Z param_applied = fn(param) 2022-08-17T12:41:14.6819034Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 980, in convert 2022-08-17T12:41:14.6819513Z return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking) 2022-08-17T12:41:14.6819924Z RuntimeError: CUDA error: invalid device ordinal 2022-08-17T12:41:14.6820374Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. 2022-08-17T12:41:14.6820843Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1. 2022-08-17T12:41:14.6821339Z Exception raised from exchangeDevice at /var/lib/jenkins/workspace/c10/cuda/impl/CUDAGuardImpl.h:34 (most recent call first): 2022-08-17T12:41:14.6822229Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fd334fff3bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:41:14.6822975Z frame #1: + 0x147a4 (0x7fd33e5177a4 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so) 2022-08-17T12:41:14.6824302Z frame #2: + 0x10a96f8 (0x7fd3362e86f8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:41:14.6824982Z frame #3: + 0x29de1b5 (0x7fd337c1d1b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:41:14.6825651Z frame #4: + 0x29de34b (0x7fd337c1d34b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:41:14.6826681Z frame #5: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, 
c10::optional, c10::optional) + 0x107 (0x7fd3401ebcd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6827549Z frame #6: + 0x1d89cc5 (0x7fd3404e2cc5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6828512Z frame #7: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x169 (0x7fd34022aa49 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6829333Z frame #8: + 0x108537a (0x7fd33f7de37a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6830315Z frame #9: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x12c6 (0x7fd33fb35206 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6831254Z frame #10: + 0x1f4dde3 (0x7fd3406a6de3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6832300Z frame #11: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fd33ff441f3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6833171Z frame #12: + 0x1d8c4e1 (0x7fd3404e54e1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6834206Z frame #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fd33ff441f3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6835153Z frame #14: + 0x305b951 (0x7fd3417b4951 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6835804Z frame #15: + 0x305befb (0x7fd3417b4efb in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6836766Z frame #16: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7fd33ff99981 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6837913Z frame #17: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7fd33fb2d08e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6838751Z frame #18: + 0x20fac79 (0x7fd340853c79 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6839751Z frame #19: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7fd3400f51c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6840608Z frame #20: + 0x34034f (0x7fd34c3a634f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6841270Z frame #21: + 0x3407fc (0x7fd34c3a67fc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6841734Z frame #22: + 0x1ddc68 (0x563a6db52c68 in /opt/conda/bin/python) 2022-08-17T12:41:14.6842152Z frame #23: + 0x1049f3 (0x563a6da799f3 in /opt/conda/bin/python) 2022-08-17T12:41:14.6842562Z frame #24: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6842950Z frame #25: + 0x104425 (0x563a6da79425 in /opt/conda/bin/python) 2022-08-17T12:41:14.6843347Z frame #26: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6843758Z frame #27: + 0x18fc9b (0x563a6db04c9b in /opt/conda/bin/python) 2022-08-17T12:41:14.6844163Z frame #28: + 0x1052a5 (0x563a6da7a2a5 in /opt/conda/bin/python) 2022-08-17T12:41:14.6844545Z frame #29: + 0x18f742 (0x563a6db04742 in 
/opt/conda/bin/python) 2022-08-17T12:41:14.6844945Z frame #30: + 0x18fc9b (0x563a6db04c9b in /opt/conda/bin/python) 2022-08-17T12:41:14.6845349Z frame #31: + 0x1052a5 (0x563a6da7a2a5 in /opt/conda/bin/python) 2022-08-17T12:41:14.6846009Z frame #32: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6846493Z frame #33: _PyObject_Call + 0x20a (0x563a6dabcfaa in /opt/conda/bin/python) 2022-08-17T12:41:14.6846991Z frame #34: _PyEval_EvalFrameDefault + 0x26e4 (0x563a6db58774 in /opt/conda/bin/python) 2022-08-17T12:41:14.6847419Z frame #35: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6847801Z frame #36: _PyObject_Call + 0x20a (0x563a6dabcfaa in /opt/conda/bin/python) 2022-08-17T12:41:14.6848430Z frame #37: + 0xa0ab2a (0x7fd34ca70b2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6849253Z frame #38: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fd34ca6ed6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6850293Z frame #39: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fd34ca71f05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6851520Z frame #40: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7fd34ca72573 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6852783Z frame #41: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7fd342c99ea4 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6854099Z frame #42: 
torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fd34ca71be5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6855021Z frame #43: + 0x453a313 (0x7fd342c93313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6855976Z frame #44: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fd342c93f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6857085Z frame #45: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fd342c8e597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6857883Z frame #46: + 0x456a202 (0x7fd342cc3202 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6858583Z frame #47: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fd334fed7eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:41:14.6859106Z frame #48: + 0xdbbf4 (0x7fd364143bf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-08-17T12:41:14.6859665Z frame #49: + 0x76db (0x7fd3847976db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-08-17T12:41:14.6860163Z frame #50: clone + 0x3f (0x7fd3844c061f in /lib/x86_64-linux-gnu/libc.so.6) 2022-08-17T12:41:14.6860395Z 2022-08-17T12:41:14.6860414Z 2022-08-17T12:41:14.6860552Z On WorkerInfo(id=1, name=worker1): 2022-08-17T12:41:14.6897954Z RuntimeError('On WorkerInfo(id=1, name=worker1):\nRuntimeError(\'CUDA error: invalid device ordinal\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\nException raised from exchangeDevice 
at /var/lib/jenkins/workspace/c10/cuda/impl/CUDAGuardImpl.h:34 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fd334fff3bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: + 0x147a4 (0x7fd33e5177a4 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)\nframe #2: + 0x10a96f8 (0x7fd3362e86f8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #3: + 0x29de1b5 (0x7fd337c1d1b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #4: + 0x29de34b (0x7fd337c1d34b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #5: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x107 (0x7fd3401ebcd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x1d89cc5 (0x7fd3404e2cc5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #7: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x169 (0x7fd34022aa49 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #8: + 0x108537a (0x7fd33f7de37a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x12c6 (0x7fd33fb35206 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x1f4dde3 (0x7fd3406a6de3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fd33ff441f3 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x1d8c4e1 (0x7fd3404e54e1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fd33ff441f3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #14: + 0x305b951 (0x7fd3417b4951 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #15: + 0x305befb (0x7fd3417b4efb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #16: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7fd33ff99981 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #17: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7fd33fb2d08e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #18: + 0x20fac79 (0x7fd340853c79 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #19: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7fd3400f51c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #20: + 0x34034f (0x7fd34c3a634f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #21: + 0x3407fc (0x7fd34c3a67fc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #22: + 0x1ddc68 (0x563a6db52c68 in /opt/conda/bin/python)\nframe #23: + 0x1049f3 (0x563a6da799f3 in /opt/conda/bin/python)\nframe #24: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #25: + 0x104425 (0x563a6da79425 in /opt/conda/bin/python)\nframe #26: + 0x18f742 (0x563a6db04742 in 
/opt/conda/bin/python)\nframe #27: + 0x18fc9b (0x563a6db04c9b in /opt/conda/bin/python)\nframe #28: + 0x1052a5 (0x563a6da7a2a5 in /opt/conda/bin/python)\nframe #29: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #30: + 0x18fc9b (0x563a6db04c9b in /opt/conda/bin/python)\nframe #31: + 0x1052a5 (0x563a6da7a2a5 in /opt/conda/bin/python)\nframe #32: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #33: _PyObject_Call + 0x20a (0x563a6dabcfaa in /opt/conda/bin/python)\nframe #34: _PyEval_EvalFrameDefault + 0x26e4 (0x563a6db58774 in /opt/conda/bin/python)\nframe #35: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #36: _PyObject_Call + 0x20a (0x563a6dabcfaa in /opt/conda/bin/python)\nframe #37: + 0xa0ab2a (0x7fd34ca70b2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #38: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fd34ca6ed6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #39: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fd34ca71f05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #40: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7fd34ca72573 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #41: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7fd342c99ea4 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #42: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fd34ca71be5 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #43: + 0x453a313 (0x7fd342c93313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #44: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fd342c93f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #45: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fd342c8e597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #46: + 0x456a202 (0x7fd342cc3202 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #47: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fd334fed7eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #48: + 0xdbbf4 (0x7fd364143bf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #49: + 0x76db (0x7fd3847976db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #50: clone + 0x3f (0x7fd3844c061f in /lib/x86_64-linux-gnu/libc.so.6)\n\')\nTraceback (most recent call last):\n File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function\n result = python_udf.func(*python_udf.args, **python_udf.kwargs)\n File "/opt/conda/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py", line 89, in _create_module\n module.to(device)\n File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 982, in to\n return self._apply(convert)\n File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 658, in _apply\n param_applied = fn(param)\n File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 980, in convert\n return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)\nRuntimeError: CUDA error: invalid device ordinal\nCUDA kernel errors might be 
asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\nException raised from exchangeDevice at /var/lib/jenkins/workspace/c10/cuda/impl/CUDAGuardImpl.h:34 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fd334fff3bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: + 0x147a4 (0x7fd33e5177a4 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)\nframe #2: + 0x10a96f8 (0x7fd3362e86f8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #3: + 0x29de1b5 (0x7fd337c1d1b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #4: + 0x29de34b (0x7fd337c1d34b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #5: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x107 (0x7fd3401ebcd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x1d89cc5 (0x7fd3404e2cc5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #7: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x169 (0x7fd34022aa49 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #8: + 0x108537a (0x7fd33f7de37a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x12c6 (0x7fd33fb35206 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x1f4dde3 (0x7fd3406a6de3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, 
at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fd33ff441f3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x1d8c4e1 (0x7fd3404e54e1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fd33ff441f3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #14: + 0x305b951 (0x7fd3417b4951 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #15: + 0x305befb (0x7fd3417b4efb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #16: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7fd33ff99981 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #17: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7fd33fb2d08e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #18: + 0x20fac79 (0x7fd340853c79 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #19: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7fd3400f51c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #20: + 0x34034f (0x7fd34c3a634f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #21: + 0x3407fc (0x7fd34c3a67fc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #22: + 0x1ddc68 (0x563a6db52c68 in /opt/conda/bin/python)\nframe #23: + 0x1049f3 (0x563a6da799f3 in /opt/conda/bin/python)\nframe #24: + 0x18f742 (0x563a6db04742 in 
/opt/conda/bin/python)\nframe #25: + 0x104425 (0x563a6da79425 in /opt/conda/bin/python)\nframe #26: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #27: + 0x18fc9b (0x563a6db04c9b in /opt/conda/bin/python)\nframe #28: + 0x1052a5 (0x563a6da7a2a5 in /opt/conda/bin/python)\nframe #29: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #30: + 0x18fc9b (0x563a6db04c9b in /opt/conda/bin/python)\nframe #31: + 0x1052a5 (0x563a6da7a2a5 in /opt/conda/bin/python)\nframe #32: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #33: _PyObject_Call + 0x20a (0x563a6dabcfaa in /opt/conda/bin/python)\nframe #34: _PyEval_EvalFrameDefault + 0x26e4 (0x563a6db58774 in /opt/conda/bin/python)\nframe #35: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python)\nframe #36: _PyObject_Call + 0x20a (0x563a6dabcfaa in /opt/conda/bin/python)\nframe #37: + 0xa0ab2a (0x7fd34ca70b2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #38: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fd34ca6ed6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #39: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fd34ca71f05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #40: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7fd34ca72573 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #41: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7fd342c99ea4 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #42: 
torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fd34ca71be5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #43: + 0x453a313 (0x7fd342c93313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #44: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fd342c93f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #45: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fd342c8e597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #46: + 0x456a202 (0x7fd342cc3202 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #47: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fd334fed7eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #48: + 0xdbbf4 (0x7fd364143bf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #49: + 0x76db (0x7fd3847976db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #50: clone + 0x3f (0x7fd3844c061f in /lib/x86_64-linux-gnu/libc.so.6)\n\n') 2022-08-17T12:41:14.6920648Z Traceback (most recent call last): 2022-08-17T12:41:14.6921196Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function 2022-08-17T12:41:14.6921677Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-08-17T12:41:14.6922116Z File "/tmp/tmpq0qlo_lh/_remote_module_non_scriptable.py", line 47, in _remote_forward 2022-08-17T12:41:14.6922498Z module = module_rref.local_value() 2022-08-17T12:41:14.6923025Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 220, in _handle_exception 2022-08-17T12:41:14.6923623Z raise 
result.exception_type(result.msg.encode("utf-8").decode("unicode_escape")) 2022-08-17T12:41:14.6924034Z RuntimeError: On WorkerInfo(id=1, name=worker1): 2022-08-17T12:41:14.6924425Z RuntimeError('CUDA error: invalid device ordinal 2022-08-17T12:41:14.6924889Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. 2022-08-17T12:41:14.6925358Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1. 2022-08-17T12:41:14.6925852Z Exception raised from exchangeDevice at /var/lib/jenkins/workspace/c10/cuda/impl/CUDAGuardImpl.h:34 (most recent call first): 2022-08-17T12:41:14.6926715Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fd334fff3bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:41:14.6927470Z frame #1: + 0x147a4 (0x7fd33e5177a4 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so) 2022-08-17T12:41:14.6928140Z frame #2: + 0x10a96f8 (0x7fd3362e86f8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:41:14.6928818Z frame #3: + 0x29de1b5 (0x7fd337c1d1b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:41:14.6929484Z frame #4: + 0x29de34b (0x7fd337c1d34b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:41:14.6930522Z frame #5: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x107 (0x7fd3401ebcd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6931402Z frame #6: + 0x1d89cc5 (0x7fd3404e2cc5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6932464Z frame #7: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x169 (0x7fd34022aa49 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6933304Z frame #8: + 0x108537a (0x7fd33f7de37a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6934268Z frame #9: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x12c6 (0x7fd33fb35206 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6935104Z frame #10: + 0x1f4dde3 (0x7fd3406a6de3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6936153Z frame #11: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fd33ff441f3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6937107Z frame #12: + 0x1d8c4e1 (0x7fd3404e54e1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6938147Z frame #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fd33ff441f3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6939022Z frame #14: + 0x305b951 (0x7fd3417b4951 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6939657Z frame #15: + 0x305befb (0x7fd3417b4efb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6940642Z frame #16: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7fd33ff99981 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6941794Z frame #17: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, 
c10::optional) + 0x13e (0x7fd33fb2d08e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6942631Z frame #18: + 0x20fac79 (0x7fd340853c79 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6944270Z frame #19: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7fd3400f51c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6945144Z frame #20: + 0x34034f (0x7fd34c3a634f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6945816Z frame #21: + 0x3407fc (0x7fd34c3a67fc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6946300Z frame #22: + 0x1ddc68 (0x563a6db52c68 in /opt/conda/bin/python) 2022-08-17T12:41:14.6946720Z frame #23: + 0x1049f3 (0x563a6da799f3 in /opt/conda/bin/python) 2022-08-17T12:41:14.6947113Z frame #24: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6947518Z frame #25: + 0x104425 (0x563a6da79425 in /opt/conda/bin/python) 2022-08-17T12:41:14.6947919Z frame #26: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6948331Z frame #27: + 0x18fc9b (0x563a6db04c9b in /opt/conda/bin/python) 2022-08-17T12:41:14.6948802Z frame #28: + 0x1052a5 (0x563a6da7a2a5 in /opt/conda/bin/python) 2022-08-17T12:41:14.6949215Z frame #29: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6949625Z frame #30: + 0x18fc9b (0x563a6db04c9b in /opt/conda/bin/python) 2022-08-17T12:41:14.6950018Z frame #31: + 0x1052a5 (0x563a6da7a2a5 in /opt/conda/bin/python) 2022-08-17T12:41:14.6950423Z frame #32: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6950832Z frame #33: _PyObject_Call + 0x20a (0x563a6dabcfaa in /opt/conda/bin/python) 2022-08-17T12:41:14.6951264Z frame #34: _PyEval_EvalFrameDefault + 
0x26e4 (0x563a6db58774 in /opt/conda/bin/python) 2022-08-17T12:41:14.6951669Z frame #35: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6952142Z frame #36: _PyObject_Call + 0x20a (0x563a6dabcfaa in /opt/conda/bin/python) 2022-08-17T12:41:14.6952765Z frame #37: + 0xa0ab2a (0x7fd34ca70b2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6953575Z frame #38: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fd34ca6ed6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6954614Z frame #39: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fd34ca71f05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6955793Z frame #40: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7fd34ca72573 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6957071Z frame #41: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7fd342c99ea4 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6958389Z frame #42: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fd34ca71be5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6959307Z frame #43: + 0x453a313 (0x7fd342c93313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6960243Z frame #44: 
torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fd342c93f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6961355Z frame #45: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fd342c8e597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6962165Z frame #46: + 0x456a202 (0x7fd342cc3202 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6962873Z frame #47: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fd334fed7eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:41:14.6963403Z frame #48: + 0xdbbf4 (0x7fd364143bf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-08-17T12:41:14.6963952Z frame #49: + 0x76db (0x7fd3847976db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-08-17T12:41:14.6964470Z frame #50: clone + 0x3f (0x7fd3844c061f in /lib/x86_64-linux-gnu/libc.so.6) 2022-08-17T12:41:14.6964799Z ') 2022-08-17T12:41:14.6965099Z Traceback (most recent call last): 2022-08-17T12:41:14.6965642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function 2022-08-17T12:41:14.6966111Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-08-17T12:41:14.6966701Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py", line 89, in _create_module 2022-08-17T12:41:14.6967082Z module.to(device) 2022-08-17T12:41:14.6967545Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 982, in to 2022-08-17T12:41:14.6967923Z return self._apply(convert) 2022-08-17T12:41:14.6968391Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 658, in _apply 2022-08-17T12:41:14.6968832Z param_applied = fn(param) 2022-08-17T12:41:14.6969324Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 980, in convert 2022-08-17T12:41:14.6969812Z return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking) 2022-08-17T12:41:14.6970208Z RuntimeError: CUDA error: invalid device ordinal 2022-08-17T12:41:14.6970674Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. 2022-08-17T12:41:14.6971143Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1. 2022-08-17T12:41:14.6971624Z Exception raised from exchangeDevice at /var/lib/jenkins/workspace/c10/cuda/impl/CUDAGuardImpl.h:34 (most recent call first): 2022-08-17T12:41:14.6972495Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fd334fff3bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:41:14.6973257Z frame #1: + 0x147a4 (0x7fd33e5177a4 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so) 2022-08-17T12:41:14.6973925Z frame #2: + 0x10a96f8 (0x7fd3362e86f8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:41:14.6974579Z frame #3: + 0x29de1b5 (0x7fd337c1d1b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:41:14.6975252Z frame #4: + 0x29de34b (0x7fd337c1d34b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:41:14.6976281Z frame #5: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x107 (0x7fd3401ebcd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6977143Z frame #6: + 0x1d89cc5 (0x7fd3404e2cc5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6978116Z frame #7: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, 
c10::optional, c10::optional) + 0x169 (0x7fd34022aa49 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6978918Z frame #8: + 0x108537a (0x7fd33f7de37a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6979895Z frame #9: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x12c6 (0x7fd33fb35206 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6980736Z frame #10: + 0x1f4dde3 (0x7fd3406a6de3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6981836Z frame #11: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fd33ff441f3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6982716Z frame #12: + 0x1d8c4e1 (0x7fd3404e54e1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6984392Z frame #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fd33ff441f3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6985282Z frame #14: + 0x305b951 (0x7fd3417b4951 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6986026Z frame #15: + 0x305befb (0x7fd3417b4efb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6987038Z frame #16: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7fd33ff99981 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6988190Z frame #17: at::native::to(at::Tensor const&, c10::optional, 
c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7fd33fb2d08e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6989023Z frame #18: + 0x20fac79 (0x7fd340853c79 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6990014Z frame #19: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7fd3400f51c5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.6990879Z frame #20: + 0x34034f (0x7fd34c3a634f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6991547Z frame #21: + 0x3407fc (0x7fd34c3a67fc in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6992029Z frame #22: + 0x1ddc68 (0x563a6db52c68 in /opt/conda/bin/python) 2022-08-17T12:41:14.6992433Z frame #23: + 0x1049f3 (0x563a6da799f3 in /opt/conda/bin/python) 2022-08-17T12:41:14.6992841Z frame #24: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6993248Z frame #25: + 0x104425 (0x563a6da79425 in /opt/conda/bin/python) 2022-08-17T12:41:14.6993655Z frame #26: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6994046Z frame #27: + 0x18fc9b (0x563a6db04c9b in /opt/conda/bin/python) 2022-08-17T12:41:14.6994459Z frame #28: + 0x1052a5 (0x563a6da7a2a5 in /opt/conda/bin/python) 2022-08-17T12:41:14.6994864Z frame #29: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6995251Z frame #30: + 0x18fc9b (0x563a6db04c9b in /opt/conda/bin/python) 2022-08-17T12:41:14.6995657Z frame #31: + 0x1052a5 (0x563a6da7a2a5 in /opt/conda/bin/python) 2022-08-17T12:41:14.6996057Z frame #32: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6996461Z frame #33: _PyObject_Call + 0x20a (0x563a6dabcfaa in /opt/conda/bin/python) 
2022-08-17T12:41:14.6996874Z frame #34: _PyEval_EvalFrameDefault + 0x26e4 (0x563a6db58774 in /opt/conda/bin/python) 2022-08-17T12:41:14.6997307Z frame #35: + 0x18f742 (0x563a6db04742 in /opt/conda/bin/python) 2022-08-17T12:41:14.6997711Z frame #36: _PyObject_Call + 0x20a (0x563a6dabcfaa in /opt/conda/bin/python) 2022-08-17T12:41:14.6998389Z frame #37: + 0xa0ab2a (0x7fd34ca70b2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.6999227Z frame #38: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fd34ca6ed6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.7000271Z frame #39: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fd34ca71f05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.7001449Z frame #40: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7fd34ca72573 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.7002781Z frame #41: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7fd342c99ea4 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.7004103Z frame #42: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fd34ca71be5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:41:14.7005007Z frame #43: + 0x453a313 (0x7fd342c93313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.7006329Z frame #44: 
torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fd342c93f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.7008130Z frame #45: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fd342c8e597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.7009562Z frame #46: + 0x456a202 (0x7fd342cc3202 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:41:14.7010280Z frame #47: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fd334fed7eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:41:14.7010929Z frame #48: + 0xdbbf4 (0x7fd364143bf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-08-17T12:41:14.7011576Z frame #49: + 0x76db (0x7fd3847976db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-08-17T12:41:14.7012099Z frame #50: clone + 0x3f (0x7fd3844c061f in /lib/x86_64-linux-gnu/libc.so.6) 2022-08-17T12:41:14.7012333Z 2022-08-17T12:41:14.7012351Z 2022-08-17T12:41:14.7012369Z 2022-08-17T12:41:15.0414935Z ok (4.235s) 2022-08-17T12:41:15.0415164Z 2022-08-17T12:41:15.0415540Z ---------------------------------------------------------------------- 2022-08-17T12:41:15.0415880Z Ran 1 test in 4.235s 2022-08-17T12:41:15.0416051Z 2022-08-17T12:41:15.0416147Z OK 2022-08-17T12:41:15.0416293Z 2022-08-17T12:41:15.0416432Z Generating XML reports... 
2022-08-17T12:41:15.0455883Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220817124110.xml 2022-08-17T12:41:16.7885318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:16.7885890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:16.7889284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:16.7889791Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:16.9689556Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgaeqahyl 2022-08-17T12:41:16.9691843Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgaeqahyl/_remote_module_non_scriptable.py 2022-08-17T12:41:17.3818382Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:41:17.3834246Z 2022-08-17T12:41:17.3834458Z Running tests... 2022-08-17T12:41:17.3834898Z ---------------------------------------------------------------------- 2022-08-17T12:41:18.8572269Z test_valid_device (__main__.TensorPipeCudaRemoteModuleTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:41:18.8750492Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3016 2022-08-17T12:41:18.8756744Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3017 2022-08-17T12:41:20.2759238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:20.2759746Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:20.2760922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:20.2761418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:20.3030122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:20.3030596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:20.3033160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:20.3033673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:20.4499903Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzxt9vrp0 2022-08-17T12:41:20.4501544Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzxt9vrp0/_remote_module_non_scriptable.py 2022-08-17T12:41:20.4704037Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpifs_1smg 2022-08-17T12:41:20.4706920Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpifs_1smg/_remote_module_non_scriptable.py 2022-08-17T12:41:20.8776150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:41:20.8850986Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:41:20.9797069Z 
fi_getinfo: -61 2022-08-17T12:41:20.9873585Z fi_getinfo: -61 2022-08-17T12:41:22.7859231Z ok (5.402s) 2022-08-17T12:41:22.7859502Z 2022-08-17T12:41:22.7859902Z ---------------------------------------------------------------------- 2022-08-17T12:41:22.7860248Z Ran 1 test in 5.402s 2022-08-17T12:41:22.7860440Z 2022-08-17T12:41:22.7860545Z OK 2022-08-17T12:41:22.7860684Z 2022-08-17T12:41:22.7860807Z Generating XML reports... 2022-08-17T12:41:22.7897637Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220817124117.xml 2022-08-17T12:41:24.5673816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:24.5674324Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:24.5675172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:24.5675670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:24.7457959Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppby4ermj 2022-08-17T12:41:24.7460824Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppby4ermj/_remote_module_non_scriptable.py 2022-08-17T12:41:25.1750518Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:41:25.1766875Z 2022-08-17T12:41:25.1767107Z Running tests... 2022-08-17T12:41:25.1767529Z ---------------------------------------------------------------------- 2022-08-17T12:41:26.7079148Z test_profiler_remote_cuda (__main__.TensorPipeCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:41:26.7265244Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3206 2022-08-17T12:41:26.7271937Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3207 2022-08-17T12:41:26.7278269Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 3208 2022-08-17T12:41:26.7284804Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 3209 2022-08-17T12:41:28.1281528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:28.1282068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:28.1282668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:28.1283163Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:28.1482900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:28.1483388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:28.1486168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:28.1486683Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:28.1489834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:28.1490324Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:28.1493335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:28.1493830Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:28.1525155Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:28.1525633Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:28.1528565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:28.1529062Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:28.2949970Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_04183_7 2022-08-17T12:41:28.2951981Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_04183_7/_remote_module_non_scriptable.py 2022-08-17T12:41:28.3265485Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5ed3g1a_ 2022-08-17T12:41:28.3268712Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5ed3g1a_/_remote_module_non_scriptable.py 2022-08-17T12:41:28.3301175Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzw7ti4lf 2022-08-17T12:41:28.3304083Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzw7ti4lf/_remote_module_non_scriptable.py 2022-08-17T12:41:28.3366658Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6b2srev1 2022-08-17T12:41:28.3368701Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6b2srev1/_remote_module_non_scriptable.py 2022-08-17T12:41:28.7137317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:41:28.7516674Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:41:28.7668605Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:41:28.7747630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:41:28.8157607Z fi_getinfo: -61 
2022-08-17T12:41:28.8534622Z fi_getinfo: -61 2022-08-17T12:41:28.8687958Z fi_getinfo: -61 2022-08-17T12:41:28.8765813Z fi_getinfo: -61 2022-08-17T12:41:33.3457854Z ok (8.169s) 2022-08-17T12:41:33.3458065Z 2022-08-17T12:41:33.3458476Z ---------------------------------------------------------------------- 2022-08-17T12:41:33.3458804Z Ran 1 test in 8.169s 2022-08-17T12:41:33.3458974Z 2022-08-17T12:41:33.3459072Z OK 2022-08-17T12:41:33.3459214Z 2022-08-17T12:41:33.3459673Z Generating XML reports... 2022-08-17T12:41:33.3496416Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRpcTest-20220817124125.xml 2022-08-17T12:41:35.0789623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:35.0790630Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:35.0791812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:35.0792721Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:35.2521424Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8d88v4m4 2022-08-17T12:41:35.2523475Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8d88v4m4/_remote_module_non_scriptable.py 2022-08-17T12:41:35.6760960Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:41:35.6777845Z 2022-08-17T12:41:35.6778307Z Running tests... 2022-08-17T12:41:35.6778811Z ---------------------------------------------------------------------- 2022-08-17T12:41:37.1772723Z test_basic_gloo_ckpt_always (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:41:37.1950995Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3553 2022-08-17T12:41:37.1956801Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3554 2022-08-17T12:41:38.5732818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:38.5733332Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:38.5734192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:38.5734686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:38.6028629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:38.6029110Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:38.6032189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:38.6032687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:38.7422420Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0mq9h6ra 2022-08-17T12:41:38.7424552Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0mq9h6ra/_remote_module_non_scriptable.py 2022-08-17T12:41:38.7781693Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpig8r3rms 2022-08-17T12:41:38.7784053Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpig8r3rms/_remote_module_non_scriptable.py 2022-08-17T12:41:39.1532558Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:41:39.2042664Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:41:39.5018727Z 
skip: Need at least 4 CUDA devices (3.824s) 2022-08-17T12:41:39.5019007Z 2022-08-17T12:41:39.5019378Z ---------------------------------------------------------------------- 2022-08-17T12:41:39.5019728Z Ran 1 test in 3.824s 2022-08-17T12:41:39.5019898Z 2022-08-17T12:41:39.5020010Z OK (skipped=1) 2022-08-17T12:41:39.5020173Z 2022-08-17T12:41:39.5020304Z Generating XML reports... 2022-08-17T12:41:39.5056719Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124135.xml 2022-08-17T12:41:41.2314802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:41.2315309Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:41.2316596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:41.2317115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:41.4012571Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpct3ok2sk 2022-08-17T12:41:41.4014003Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpct3ok2sk/_remote_module_non_scriptable.py 2022-08-17T12:41:41.8110029Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:41:41.8125262Z 2022-08-17T12:41:41.8125523Z Running tests... 2022-08-17T12:41:41.8125945Z ---------------------------------------------------------------------- 2022-08-17T12:41:43.2770897Z test_basic_gloo_ckpt_except_last (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:41:43.2959111Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3656 2022-08-17T12:41:43.2964926Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3657 2022-08-17T12:41:44.6654865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:44.6656407Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:44.6657040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:44.6657540Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:44.6926513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:44.6926970Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:44.6929583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:44.6930090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:44.8325476Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdd0lhoo_ 2022-08-17T12:41:44.8327148Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdd0lhoo_/_remote_module_non_scriptable.py 2022-08-17T12:41:44.8646621Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9c9_r26a 2022-08-17T12:41:44.8649646Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9c9_r26a/_remote_module_non_scriptable.py 2022-08-17T12:41:45.2400870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:41:45.2869523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:41:45.6025194Z 
skip: Need at least 4 CUDA devices (3.790s) 2022-08-17T12:41:45.6025460Z 2022-08-17T12:41:45.6025842Z ---------------------------------------------------------------------- 2022-08-17T12:41:45.6026185Z Ran 1 test in 3.790s 2022-08-17T12:41:45.6026354Z 2022-08-17T12:41:45.6026466Z OK (skipped=1) 2022-08-17T12:41:45.6028691Z 2022-08-17T12:41:45.6029388Z Generating XML reports... 2022-08-17T12:41:45.6061717Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124141.xml 2022-08-17T12:41:47.3695388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:47.3695886Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:47.3697073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:47.3697580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:47.5456019Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9k2ci996 2022-08-17T12:41:47.5458775Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9k2ci996/_remote_module_non_scriptable.py 2022-08-17T12:41:47.9561918Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:41:47.9577346Z 2022-08-17T12:41:47.9577591Z Running tests... 2022-08-17T12:41:47.9578038Z ---------------------------------------------------------------------- 2022-08-17T12:41:49.4372417Z test_basic_gloo_ckpt_never (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:41:49.4549262Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3759 2022-08-17T12:41:49.4555537Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3760 2022-08-17T12:41:50.8293653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:50.8294198Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:50.8295041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:50.8295548Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:50.8586906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:50.8587443Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:50.8589464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:50.8589962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:50.9963726Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjn871t2x 2022-08-17T12:41:50.9965885Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjn871t2x/_remote_module_non_scriptable.py 2022-08-17T12:41:51.0318658Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp08t3gqs9 2022-08-17T12:41:51.0321408Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp08t3gqs9/_remote_module_non_scriptable.py 2022-08-17T12:41:51.4030035Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:41:51.4562436Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:41:51.7616045Z 
skip: Need at least 4 CUDA devices (3.804s) 2022-08-17T12:41:51.7616289Z 2022-08-17T12:41:51.7616679Z ---------------------------------------------------------------------- 2022-08-17T12:41:51.7617020Z Ran 1 test in 3.804s 2022-08-17T12:41:51.7617186Z 2022-08-17T12:41:51.7617282Z OK (skipped=1) 2022-08-17T12:41:51.7617442Z 2022-08-17T12:41:51.7617572Z Generating XML reports... 2022-08-17T12:41:51.7653524Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124147.xml 2022-08-17T12:41:53.4973664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:53.4974485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:53.4975298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:53.4975796Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:53.6714281Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8t3gse36 2022-08-17T12:41:53.6716864Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8t3gse36/_remote_module_non_scriptable.py 2022-08-17T12:41:54.0940505Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:41:54.0956562Z 2022-08-17T12:41:54.0956785Z Running tests... 2022-08-17T12:41:54.0957499Z ---------------------------------------------------------------------- 2022-08-17T12:41:55.6118589Z test_basic_gloo_ckpt_never_find_unused (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:41:55.6301450Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3862 2022-08-17T12:41:55.6307845Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3863 2022-08-17T12:41:57.0058477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:57.0058984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:57.0060182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:57.0060677Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:57.0759641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:57.0760124Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:57.0762562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:57.0763055Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:57.1755352Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo1vvwddx 2022-08-17T12:41:57.1756219Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo1vvwddx/_remote_module_non_scriptable.py 2022-08-17T12:41:57.2519550Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxdjr6q8w 2022-08-17T12:41:57.2520418Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxdjr6q8w/_remote_module_non_scriptable.py 2022-08-17T12:41:57.5923727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:41:57.6763731Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:41:58.0371864Z 
skip: Need at least 4 CUDA devices (3.941s) 2022-08-17T12:41:58.0372110Z 2022-08-17T12:41:58.0372489Z ---------------------------------------------------------------------- 2022-08-17T12:41:58.0372857Z Ran 1 test in 3.941s 2022-08-17T12:41:58.0373024Z 2022-08-17T12:41:58.0373134Z OK (skipped=1) 2022-08-17T12:41:58.0373293Z 2022-08-17T12:41:58.0373421Z Generating XML reports... 2022-08-17T12:41:58.0409143Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124154.xml 2022-08-17T12:41:59.8034840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:41:59.8035346Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:41:59.8036403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:41:59.8036924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:41:59.9770912Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprhz3b6u6 2022-08-17T12:41:59.9773255Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprhz3b6u6/_remote_module_non_scriptable.py 2022-08-17T12:42:00.4058121Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:42:00.4074518Z 2022-08-17T12:42:00.4075070Z Running tests... 2022-08-17T12:42:00.4075686Z ---------------------------------------------------------------------- 2022-08-17T12:42:01.9204898Z test_basic_nccl_ckpt_always (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:42:01.9389276Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3965 2022-08-17T12:42:01.9395565Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3966 2022-08-17T12:42:03.3119252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:03.3119957Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:03.3120774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:03.3121491Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:03.3163843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:03.3164531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:03.3167278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:03.3168000Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:03.4797223Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp373ys7z3 2022-08-17T12:42:03.4798069Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp373ys7z3/_remote_module_non_scriptable.py 2022-08-17T12:42:03.4845259Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzhwrjab6 2022-08-17T12:42:03.4848092Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzhwrjab6/_remote_module_non_scriptable.py 2022-08-17T12:42:03.8925695Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:42:03.9036339Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:42:04.2456707Z 
skip: Need at least 4 CUDA devices (3.838s) 2022-08-17T12:42:04.2456954Z 2022-08-17T12:42:04.2457336Z ---------------------------------------------------------------------- 2022-08-17T12:42:04.2457680Z Ran 1 test in 3.838s 2022-08-17T12:42:04.2457874Z 2022-08-17T12:42:04.2457976Z OK (skipped=1) 2022-08-17T12:42:04.2458137Z 2022-08-17T12:42:04.2458264Z Generating XML reports... 2022-08-17T12:42:04.2494060Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124200.xml 2022-08-17T12:42:06.0010188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:06.0010705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:06.0012277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:06.0012788Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:06.1689770Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc60fezgl 2022-08-17T12:42:06.1692437Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc60fezgl/_remote_module_non_scriptable.py 2022-08-17T12:42:06.5760624Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:42:06.5776295Z 2022-08-17T12:42:06.5776642Z Running tests... 2022-08-17T12:42:06.5777417Z ---------------------------------------------------------------------- 2022-08-17T12:42:08.0507404Z test_basic_nccl_ckpt_except_last (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:42:08.0686071Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4068 2022-08-17T12:42:08.0692141Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4069 2022-08-17T12:42:09.4503571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:09.4504267Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:09.4505068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:09.4505896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:09.4941995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:09.4942457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:09.4945286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:09.4945783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:09.6180187Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprxop_rgs 2022-08-17T12:42:09.6182891Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprxop_rgs/_remote_module_non_scriptable.py 2022-08-17T12:42:09.6669562Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp6lg56h6 2022-08-17T12:42:09.6671757Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp6lg56h6/_remote_module_non_scriptable.py 2022-08-17T12:42:10.0272673Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:42:10.0923731Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:42:10.3752084Z 
skip: Need at least 4 CUDA devices (3.797s) 2022-08-17T12:42:10.3752501Z 2022-08-17T12:42:10.3753053Z ---------------------------------------------------------------------- 2022-08-17T12:42:10.3753400Z Ran 1 test in 3.797s 2022-08-17T12:42:10.3753569Z 2022-08-17T12:42:10.3753681Z OK (skipped=1) 2022-08-17T12:42:10.3753821Z 2022-08-17T12:42:10.3757035Z Generating XML reports... 2022-08-17T12:42:10.3790446Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124206.xml 2022-08-17T12:42:12.1629866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:12.1630524Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:12.1631530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:12.1632014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:12.3367465Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa1on_xyc 2022-08-17T12:42:12.3370122Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa1on_xyc/_remote_module_non_scriptable.py 2022-08-17T12:42:12.7640491Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:42:12.7656012Z 2022-08-17T12:42:12.7656166Z Running tests... 2022-08-17T12:42:12.7656609Z ---------------------------------------------------------------------- 2022-08-17T12:42:14.3018620Z test_basic_nccl_ckpt_never (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:42:14.3202941Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4171 2022-08-17T12:42:14.3209343Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4172 2022-08-17T12:42:15.7329024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:15.7329821Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:15.7330935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:15.7331429Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:15.7452300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:15.7452758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:15.7455819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:15.7456322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:15.9028547Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7q0j98pc 2022-08-17T12:42:15.9031057Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7q0j98pc/_remote_module_non_scriptable.py 2022-08-17T12:42:15.9102626Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpco_si8dt 2022-08-17T12:42:15.9105604Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpco_si8dt/_remote_module_non_scriptable.py 2022-08-17T12:42:16.3196117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:42:16.3223741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:42:16.6269012Z 
skip: Need at least 4 CUDA devices (3.861s) 2022-08-17T12:42:16.6269351Z 2022-08-17T12:42:16.6269750Z ---------------------------------------------------------------------- 2022-08-17T12:42:16.6270101Z Ran 1 test in 3.861s 2022-08-17T12:42:16.6270272Z 2022-08-17T12:42:16.6270394Z OK (skipped=1) 2022-08-17T12:42:16.6270628Z 2022-08-17T12:42:16.6270854Z Generating XML reports... 2022-08-17T12:42:16.6306250Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124212.xml 2022-08-17T12:42:18.3857773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:18.3858485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:18.3859291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:18.3859936Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:18.5594734Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpab2_i_1l 2022-08-17T12:42:18.5597513Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpab2_i_1l/_remote_module_non_scriptable.py 2022-08-17T12:42:18.9822740Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:42:18.9838931Z 2022-08-17T12:42:18.9839214Z Running tests... 2022-08-17T12:42:18.9839846Z ---------------------------------------------------------------------- 2022-08-17T12:42:20.5018099Z test_basic_nccl_ckpt_never_find_unused (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:42:20.5202374Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4274 2022-08-17T12:42:20.5208550Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4275 2022-08-17T12:42:21.8964022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:21.8964558Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:21.8965697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:21.8966225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:21.9296671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:21.9297158Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:21.9299454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:21.9299950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:22.0641694Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpajbql383 2022-08-17T12:42:22.0643886Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpajbql383/_remote_module_non_scriptable.py 2022-08-17T12:42:22.1060698Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfqyqvcuv 2022-08-17T12:42:22.1063456Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfqyqvcuv/_remote_module_non_scriptable.py 2022-08-17T12:42:22.4714087Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:42:22.5341726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:42:22.9272530Z 
skip: Need at least 4 CUDA devices (3.943s) 2022-08-17T12:42:22.9272783Z 2022-08-17T12:42:22.9273138Z ---------------------------------------------------------------------- 2022-08-17T12:42:22.9273483Z Ran 1 test in 3.943s 2022-08-17T12:42:22.9273661Z 2022-08-17T12:42:22.9273775Z OK (skipped=1) 2022-08-17T12:42:22.9273937Z 2022-08-17T12:42:22.9274072Z Generating XML reports... 2022-08-17T12:42:22.9311331Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124218.xml 2022-08-17T12:42:24.6935420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:24.6935959Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:24.6936892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:24.6937416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:24.8682545Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp45gh_tej 2022-08-17T12:42:24.8685621Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp45gh_tej/_remote_module_non_scriptable.py 2022-08-17T12:42:25.2904542Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:42:25.2920908Z 2022-08-17T12:42:25.2921092Z Running tests... 2022-08-17T12:42:25.2921538Z ---------------------------------------------------------------------- 2022-08-17T12:42:26.8199749Z test_async_execution_nested_with_cuda_future (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:42:26.8383710Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4377 2022-08-17T12:42:26.8390179Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4378 2022-08-17T12:42:26.8396217Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 4379 2022-08-17T12:42:26.8402679Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 4380 2022-08-17T12:42:28.2267021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:28.2268007Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:28.2269225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:28.2270498Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:28.2283138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:28.2284044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:28.2285902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:28.2286870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:28.2307805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:28.2308702Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:28.2311370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:28.2312327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:28.2485802Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:28.2486705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:28.2488623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:28.2489604Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:28.4033875Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5vy0jhe6 2022-08-17T12:42:28.4035011Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5vy0jhe6/_remote_module_non_scriptable.py 2022-08-17T12:42:28.4067180Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpslw31ze0 2022-08-17T12:42:28.4069450Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpslw31ze0/_remote_module_non_scriptable.py 2022-08-17T12:42:28.4110441Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_5vu1d_z 2022-08-17T12:42:28.4113376Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_5vu1d_z/_remote_module_non_scriptable.py 2022-08-17T12:42:28.4225772Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1mwmeb96 2022-08-17T12:42:28.4229193Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1mwmeb96/_remote_module_non_scriptable.py 2022-08-17T12:42:28.8328133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:42:28.8342610Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:42:28.8437007Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:42:28.8543913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:42:28.9353096Z fi_getinfo: -61 
2022-08-17T12:42:28.9361792Z fi_getinfo: -61 2022-08-17T12:42:28.9456820Z fi_getinfo: -61 2022-08-17T12:42:28.9561816Z fi_getinfo: -61 2022-08-17T12:42:33.8581361Z ok (8.566s) 2022-08-17T12:42:33.8581571Z 2022-08-17T12:42:33.8581975Z ---------------------------------------------------------------------- 2022-08-17T12:42:33.8582332Z Ran 1 test in 8.566s 2022-08-17T12:42:33.8582485Z 2022-08-17T12:42:33.8582581Z OK 2022-08-17T12:42:33.8582717Z 2022-08-17T12:42:33.8582855Z Generating XML reports... 2022-08-17T12:42:33.8619232Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124225.xml 2022-08-17T12:42:35.6164064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:35.6164596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:35.6165416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:35.6166184Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:35.7904153Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5zcoz_ol 2022-08-17T12:42:35.7906961Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5zcoz_ol/_remote_module_non_scriptable.py 2022-08-17T12:42:36.2141324Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:42:36.2157707Z 2022-08-17T12:42:36.2158021Z Running tests... 2022-08-17T12:42:36.2158461Z ---------------------------------------------------------------------- 2022-08-17T12:42:37.7261228Z test_async_execution_with_cuda_future (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:42:37.7446025Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4724 2022-08-17T12:42:37.7452164Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4725 2022-08-17T12:42:37.7459076Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 4726 2022-08-17T12:42:37.7465406Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 4727 2022-08-17T12:42:39.1273159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:39.1273977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:39.1274708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:39.1275190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:39.1546436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:39.1546933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:39.1550318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:39.1550800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:39.1839420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:39.1841270Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:39.1842590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:39.1843061Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:39.1972846Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:39.1973327Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:39.1976419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:39.1976896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:39.2953246Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoah49bly 2022-08-17T12:42:39.2955126Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoah49bly/_remote_module_non_scriptable.py 2022-08-17T12:42:39.3283431Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqs0dm13o 2022-08-17T12:42:39.3286050Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqs0dm13o/_remote_module_non_scriptable.py 2022-08-17T12:42:39.3515382Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6zec2dmo 2022-08-17T12:42:39.3516824Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6zec2dmo/_remote_module_non_scriptable.py 2022-08-17T12:42:39.3728848Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9p0kwg_k 2022-08-17T12:42:39.3731111Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9p0kwg_k/_remote_module_non_scriptable.py 2022-08-17T12:42:39.7090880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:42:39.7687842Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:42:39.7748656Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:42:39.8032891Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:42:39.8113019Z fi_getinfo: -61 
2022-08-17T12:42:39.8707199Z fi_getinfo: -61 2022-08-17T12:42:39.8767453Z fi_getinfo: -61 2022-08-17T12:42:39.9050927Z fi_getinfo: -61 2022-08-17T12:42:47.3700766Z ok (11.154s) 2022-08-17T12:42:47.3700978Z 2022-08-17T12:42:47.3701386Z ---------------------------------------------------------------------- 2022-08-17T12:42:47.3701737Z Ran 1 test in 11.154s 2022-08-17T12:42:47.3701910Z 2022-08-17T12:42:47.3702006Z OK 2022-08-17T12:42:47.3702143Z 2022-08-17T12:42:47.3702281Z Generating XML reports... 2022-08-17T12:42:47.3739648Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124236.xml 2022-08-17T12:42:49.1441276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:49.1441812Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:49.1442861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:49.1443361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:49.3200717Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6w58o6bs 2022-08-17T12:42:49.3203186Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6w58o6bs/_remote_module_non_scriptable.py 2022-08-17T12:42:49.7489869Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:42:49.7506506Z 2022-08-17T12:42:49.7507006Z Running tests... 2022-08-17T12:42:49.7507545Z ---------------------------------------------------------------------- 2022-08-17T12:42:51.2669678Z test_cuda_future_callback_changes_devices (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:42:51.2855911Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5071 2022-08-17T12:42:51.2862250Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5072 2022-08-17T12:42:51.2869712Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5073 2022-08-17T12:42:51.2876147Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5074 2022-08-17T12:42:52.6909815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:52.6910356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:52.6910998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:52.6911474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:52.7060129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:52.7060613Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:52.7063519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:52.7064407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:52.7104542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:52.7105329Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:52.7107705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:52.7108177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:52.7383816Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:42:52.7384592Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:42:52.7386924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:42:52.7387610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:42:52.8596921Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0dupgv5h 2022-08-17T12:42:52.8598783Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0dupgv5h/_remote_module_non_scriptable.py 2022-08-17T12:42:52.8738017Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfjvzo8q1 2022-08-17T12:42:52.8740576Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfjvzo8q1/_remote_module_non_scriptable.py 2022-08-17T12:42:52.8759232Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7mb_a0s1 2022-08-17T12:42:52.8761747Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7mb_a0s1/_remote_module_non_scriptable.py 2022-08-17T12:42:52.9138961Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr4fzkji5 2022-08-17T12:42:52.9140307Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr4fzkji5/_remote_module_non_scriptable.py 2022-08-17T12:42:53.2869442Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:42:53.2920993Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:42:53.2938962Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:42:53.3424022Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:43:00.3094290Z ok (10.558s) 
2022-08-17T12:43:00.3094550Z 2022-08-17T12:43:00.3094954Z ---------------------------------------------------------------------- 2022-08-17T12:43:00.3095325Z Ran 1 test in 10.559s 2022-08-17T12:43:00.3095498Z 2022-08-17T12:43:00.3095598Z OK 2022-08-17T12:43:00.3095739Z 2022-08-17T12:43:00.3095863Z Generating XML reports... 2022-08-17T12:43:00.3133069Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124249.xml 2022-08-17T12:43:02.1029861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:02.1030610Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:02.1031495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:02.1031975Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:02.2798593Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8qykl8ab 2022-08-17T12:43:02.2801038Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8qykl8ab/_remote_module_non_scriptable.py 2022-08-17T12:43:02.7067626Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:43:02.7084051Z 2022-08-17T12:43:02.7084203Z Running tests... 2022-08-17T12:43:02.7084672Z ---------------------------------------------------------------------- 2022-08-17T12:43:04.2189735Z test_cuda_future_can_extract_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:43:04.2376503Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5250 2022-08-17T12:43:04.2382577Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5251 2022-08-17T12:43:04.2389118Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5252 2022-08-17T12:43:04.2396307Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5253 2022-08-17T12:43:05.6335809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:05.6336341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:05.6336937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:05.6337733Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:05.6348107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:05.6348587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:05.6351964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:05.6352453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:05.6545152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:05.6545638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:05.6548304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:05.6548812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:05.6555575Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:05.6556036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:05.6559366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:05.6559855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:05.8083691Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4kh27cst 2022-08-17T12:43:05.8084753Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4kh27cst/_remote_module_non_scriptable.py 2022-08-17T12:43:05.8107082Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvf657_8f 2022-08-17T12:43:05.8110281Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvf657_8f/_remote_module_non_scriptable.py 2022-08-17T12:43:05.8303117Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo4ob2ef1 2022-08-17T12:43:05.8306121Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo4ob2ef1/_remote_module_non_scriptable.py 2022-08-17T12:43:05.8355617Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5aile7ln 2022-08-17T12:43:05.8358779Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5aile7ln/_remote_module_non_scriptable.py 2022-08-17T12:43:06.2344792Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:43:06.2459577Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:43:06.2499365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:43:06.2674966Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:43:11.9583667Z ok (9.250s) 
2022-08-17T12:43:11.9583881Z 2022-08-17T12:43:11.9584290Z ---------------------------------------------------------------------- 2022-08-17T12:43:11.9584636Z Ran 1 test in 9.250s 2022-08-17T12:43:11.9585174Z 2022-08-17T12:43:11.9585268Z OK 2022-08-17T12:43:11.9585416Z 2022-08-17T12:43:11.9586134Z Generating XML reports... 2022-08-17T12:43:11.9625233Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124302.xml 2022-08-17T12:43:13.7431743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:13.7432260Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:13.7432866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:13.7433356Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:13.9183384Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6pieadf5 2022-08-17T12:43:13.9185768Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6pieadf5/_remote_module_non_scriptable.py 2022-08-17T12:43:14.3466964Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:43:14.3482991Z 2022-08-17T12:43:14.3483235Z Running tests... 2022-08-17T12:43:14.3483677Z ---------------------------------------------------------------------- 2022-08-17T12:43:15.8824158Z test_cuda_future_can_extract_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:43:15.9012204Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5485 2022-08-17T12:43:15.9018038Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5486 2022-08-17T12:43:15.9024953Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5487 2022-08-17T12:43:15.9031668Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5488 2022-08-17T12:43:17.2924634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:17.2925591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:17.2926745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:17.2927667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:17.3026047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:17.3026950Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:17.3030604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:17.3031579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:17.3291243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:17.3292184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:17.3293442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:17.3293957Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:17.3544847Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:17.3545331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:17.3548658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:17.3549162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:17.4634989Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbtvdsf8m 2022-08-17T12:43:17.4636640Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbtvdsf8m/_remote_module_non_scriptable.py 2022-08-17T12:43:17.4749818Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgsjg1v_o 2022-08-17T12:43:17.4752723Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgsjg1v_o/_remote_module_non_scriptable.py 2022-08-17T12:43:17.5047169Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvnibqwo6 2022-08-17T12:43:17.5049725Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvnibqwo6/_remote_module_non_scriptable.py 2022-08-17T12:43:17.5272345Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6gemrfkd 2022-08-17T12:43:17.5274303Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6gemrfkd/_remote_module_non_scriptable.py 2022-08-17T12:43:17.8873722Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:43:17.9045880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:43:17.9453144Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:43:17.9482905Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:43:23.1208469Z ok (8.772s) 
2022-08-17T12:43:23.1208862Z 2022-08-17T12:43:23.1209646Z ---------------------------------------------------------------------- 2022-08-17T12:43:23.1210165Z Ran 1 test in 8.772s 2022-08-17T12:43:23.1210339Z 2022-08-17T12:43:23.1210432Z OK 2022-08-17T12:43:23.1210551Z 2022-08-17T12:43:23.1210690Z Generating XML reports... 2022-08-17T12:43:23.1246814Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124314.xml 2022-08-17T12:43:24.8794925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:24.8795468Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:24.8796463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:24.8796963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:25.0598871Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8q7ilr6n 2022-08-17T12:43:25.0600775Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8q7ilr6n/_remote_module_non_scriptable.py 2022-08-17T12:43:25.4909078Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:43:25.4925516Z 2022-08-17T12:43:25.4925819Z Running tests... 2022-08-17T12:43:25.4926259Z ---------------------------------------------------------------------- 2022-08-17T12:43:26.9907985Z test_cuda_future_can_extract_custom_class_with_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:43:27.0088774Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5660 2022-08-17T12:43:27.0095426Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5661 2022-08-17T12:43:27.0101362Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5662 2022-08-17T12:43:27.0108028Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5663 2022-08-17T12:43:28.4083620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:28.4084174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:28.4084782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:28.4085298Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:28.4094908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:28.4095411Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:28.4098005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:28.4098500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:28.4130913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:28.4131393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:28.4134549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:28.4135196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:28.4275490Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:28.4275966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:28.4278832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:28.4279322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:28.5837982Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt3uxp1zl 2022-08-17T12:43:28.5838838Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt3uxp1zl/_remote_module_non_scriptable.py 2022-08-17T12:43:28.5885308Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_dx3ro8g 2022-08-17T12:43:28.5887964Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_dx3ro8g/_remote_module_non_scriptable.py 2022-08-17T12:43:28.5890919Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprj8w345r 2022-08-17T12:43:28.5893976Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprj8w345r/_remote_module_non_scriptable.py 2022-08-17T12:43:28.6016595Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe1leq8vi 2022-08-17T12:43:28.6019465Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe1leq8vi/_remote_module_non_scriptable.py 2022-08-17T12:43:29.0118742Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:43:29.0195703Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:43:29.0205229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:43:29.0401877Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:43:34.7295208Z ok (9.237s) 
2022-08-17T12:43:34.7295414Z 2022-08-17T12:43:34.7295827Z ---------------------------------------------------------------------- 2022-08-17T12:43:34.7296173Z Ran 1 test in 9.237s 2022-08-17T12:43:34.7296340Z 2022-08-17T12:43:34.7296433Z OK 2022-08-17T12:43:34.7296572Z 2022-08-17T12:43:34.7296707Z Generating XML reports... 2022-08-17T12:43:34.7334179Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124325.xml 2022-08-17T12:43:36.4976896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:36.4977412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:36.4978405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:36.4978904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:36.6726226Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdh5niygt 2022-08-17T12:43:36.6728760Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdh5niygt/_remote_module_non_scriptable.py 2022-08-17T12:43:37.0936831Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:43:37.0953286Z 2022-08-17T12:43:37.0953737Z Running tests... 2022-08-17T12:43:37.0954229Z ---------------------------------------------------------------------- 2022-08-17T12:43:38.6047763Z test_cuda_future_can_extract_custom_class_with_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:43:38.6232761Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5899 2022-08-17T12:43:38.6239059Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5900 2022-08-17T12:43:38.6245189Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5901 2022-08-17T12:43:38.6251428Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5902 2022-08-17T12:43:40.0226454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:40.0227011Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:40.0228227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:40.0228721Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:40.0232298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:40.0232779Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:40.0233550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:40.0234024Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:40.0235458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:40.0235945Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:40.0237034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:40.0237567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:40.0384175Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:40.0384714Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:40.0391915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:40.0392436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:40.2043793Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2sncd8y4 2022-08-17T12:43:40.2044405Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcmthkwq6 2022-08-17T12:43:40.2045185Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcmthkwq6/_remote_module_non_scriptable.py 2022-08-17T12:43:40.2045763Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2sncd8y4/_remote_module_non_scriptable.py 2022-08-17T12:43:40.2069502Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp940n8nmp 2022-08-17T12:43:40.2072208Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp940n8nmp/_remote_module_non_scriptable.py 2022-08-17T12:43:40.2145918Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdte6uqyz 2022-08-17T12:43:40.2148881Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdte6uqyz/_remote_module_non_scriptable.py 2022-08-17T12:43:40.6313565Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:43:40.6367769Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:43:40.6382369Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:43:40.6547072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:43:46.4439875Z ok (9.348s) 
2022-08-17T12:43:46.4440099Z 2022-08-17T12:43:46.4440480Z ---------------------------------------------------------------------- 2022-08-17T12:43:46.4440828Z Ran 1 test in 9.349s 2022-08-17T12:43:46.4440997Z 2022-08-17T12:43:46.4441089Z OK 2022-08-17T12:43:46.4441230Z 2022-08-17T12:43:46.4441366Z Generating XML reports... 2022-08-17T12:43:46.4478523Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124337.xml 2022-08-17T12:43:48.2195638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:48.2196522Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:48.2197423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:48.2197903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:48.3927893Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxdm8sgv7 2022-08-17T12:43:48.3930396Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxdm8sgv7/_remote_module_non_scriptable.py 2022-08-17T12:43:48.8150853Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:43:48.8166643Z 2022-08-17T12:43:48.8167021Z Running tests... 2022-08-17T12:43:48.8167488Z ---------------------------------------------------------------------- 2022-08-17T12:43:50.3320075Z test_cuda_future_can_extract_list_with_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:43:50.3502461Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6078 2022-08-17T12:43:50.3508521Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6079 2022-08-17T12:43:50.3514771Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 6080 2022-08-17T12:43:50.3521001Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 6081 2022-08-17T12:43:51.7385749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:51.7386735Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:51.7387926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:51.7388861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:51.7993960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:51.7994631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:51.7995595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:51.7996175Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:51.8074381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:51.8074973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:51.8077265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:51.8077983Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:51.8491024Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:51.8491729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:51.8492513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:51.8492989Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:43:51.9069646Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpea_3rka3 2022-08-17T12:43:51.9071180Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpea_3rka3/_remote_module_non_scriptable.py 2022-08-17T12:43:51.9730078Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2l0yzzs1 2022-08-17T12:43:51.9731282Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2l0yzzs1/_remote_module_non_scriptable.py 2022-08-17T12:43:51.9774839Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjk_oqig2 2022-08-17T12:43:51.9777570Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjk_oqig2/_remote_module_non_scriptable.py 2022-08-17T12:43:52.0241720Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp617h4n_z 2022-08-17T12:43:52.0242732Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp617h4n_z/_remote_module_non_scriptable.py 2022-08-17T12:43:52.3215255Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:43:52.3961311Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:43:52.4072100Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:43:52.4527741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:43:58.0710280Z ok (9.254s) 
2022-08-17T12:43:58.0710642Z 2022-08-17T12:43:58.0711302Z ---------------------------------------------------------------------- 2022-08-17T12:43:58.0711926Z Ran 1 test in 9.254s 2022-08-17T12:43:58.0712221Z 2022-08-17T12:43:58.0712384Z OK 2022-08-17T12:43:58.0712598Z 2022-08-17T12:43:58.0712836Z Generating XML reports... 2022-08-17T12:43:58.0749315Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124348.xml 2022-08-17T12:43:59.8442411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:43:59.8442934Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:43:59.8443731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:43:59.8444249Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:00.0182711Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_cxsr_d9 2022-08-17T12:44:00.0185097Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_cxsr_d9/_remote_module_non_scriptable.py 2022-08-17T12:44:00.4494709Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:44:00.4510867Z 2022-08-17T12:44:00.4511290Z Running tests... 2022-08-17T12:44:00.4511774Z ---------------------------------------------------------------------- 2022-08-17T12:44:01.9618796Z test_cuda_future_can_extract_list_with_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:44:01.9803742Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6313 2022-08-17T12:44:01.9809365Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6314 2022-08-17T12:44:01.9815554Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 6315 2022-08-17T12:44:01.9822325Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 6316 2022-08-17T12:44:03.3699252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:03.3700227Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:03.3701427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:03.3702338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:03.3703783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:03.3704695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:03.3706318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:03.3706845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:03.4262847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:03.4263369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:03.4266405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:03.4266909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:03.4329679Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:03.4330164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:03.4332896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:03.4333401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:03.5443724Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxurq9msw 2022-08-17T12:44:03.5445034Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxurq9msw/_remote_module_non_scriptable.py 2022-08-17T12:44:03.5460729Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpydgx97j0 2022-08-17T12:44:03.5465015Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpydgx97j0/_remote_module_non_scriptable.py 2022-08-17T12:44:03.5929448Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmjzzas8p 2022-08-17T12:44:03.5931353Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmjzzas8p/_remote_module_non_scriptable.py 2022-08-17T12:44:03.6004376Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv186jn7w 2022-08-17T12:44:03.6006904Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv186jn7w/_remote_module_non_scriptable.py 2022-08-17T12:44:03.9698595Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:44:04.0029105Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:44:04.0185366Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:44:04.0236051Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:44:09.3003552Z ok (8.849s) 
2022-08-17T12:44:09.3003799Z 2022-08-17T12:44:09.3004378Z ---------------------------------------------------------------------- 2022-08-17T12:44:09.3004735Z Ran 1 test in 8.849s 2022-08-17T12:44:09.3004917Z 2022-08-17T12:44:09.3005010Z OK 2022-08-17T12:44:09.3005150Z 2022-08-17T12:44:09.3005292Z Generating XML reports... 2022-08-17T12:44:09.3043479Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124400.xml 2022-08-17T12:44:11.0705462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:11.0706019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:11.0706871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:11.0707356Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:11.2452388Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc6b4y8kp 2022-08-17T12:44:11.2454220Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc6b4y8kp/_remote_module_non_scriptable.py 2022-08-17T12:44:11.6689454Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:44:11.6705012Z 2022-08-17T12:44:11.6705269Z Running tests... 2022-08-17T12:44:11.6705702Z ---------------------------------------------------------------------- 2022-08-17T12:44:13.1572482Z test_cuda_future_device_as_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:44:13.1748136Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6488 2022-08-17T12:44:13.1754543Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6489 2022-08-17T12:44:13.1760501Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 6490 2022-08-17T12:44:13.1766703Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 6491 2022-08-17T12:44:14.5656672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:14.5657602Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:14.5658755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:14.5659654Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:14.5664000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:14.5665181Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:14.5667199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:14.5668165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:14.5716783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:14.5717670Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:14.5720788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:14.5721770Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:14.5875761Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:14.5876689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:14.5878763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:14.5879710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:14.7413845Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2el3pbi5 2022-08-17T12:44:14.7414971Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2el3pbi5/_remote_module_non_scriptable.py 2022-08-17T12:44:14.7426177Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpob1_359b 2022-08-17T12:44:14.7428983Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpob1_359b/_remote_module_non_scriptable.py 2022-08-17T12:44:14.7453632Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7irmjwpw 2022-08-17T12:44:14.7455875Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7irmjwpw/_remote_module_non_scriptable.py 2022-08-17T12:44:14.7620484Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsfu49gr4 2022-08-17T12:44:14.7623313Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsfu49gr4/_remote_module_non_scriptable.py 2022-08-17T12:44:15.1717424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:44:15.1734677Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:44:15.1737805Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:44:15.1940964Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:44:15.5837275Z ok (3.913s) 
2022-08-17T12:44:15.5837532Z 2022-08-17T12:44:15.5837956Z ---------------------------------------------------------------------- 2022-08-17T12:44:15.5838308Z Ran 1 test in 3.913s 2022-08-17T12:44:15.5838478Z 2022-08-17T12:44:15.5838569Z OK 2022-08-17T12:44:15.5839427Z 2022-08-17T12:44:15.5839694Z Generating XML reports... 2022-08-17T12:44:15.5874399Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124411.xml 2022-08-17T12:44:17.3416308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:17.3416842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:17.3418357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:17.3419197Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:17.5161160Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphc5_4o18 2022-08-17T12:44:17.5163838Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphc5_4o18/_remote_module_non_scriptable.py 2022-08-17T12:44:17.9417416Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:44:17.9433895Z 2022-08-17T12:44:17.9434205Z Running tests... 2022-08-17T12:44:17.9435106Z ---------------------------------------------------------------------- 2022-08-17T12:44:19.4446811Z test_cuda_future_device_as_int (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:44:19.4631328Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6659 2022-08-17T12:44:19.4637936Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6660 2022-08-17T12:44:19.4644781Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 6661 2022-08-17T12:44:19.4651492Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 6662 2022-08-17T12:44:20.8426125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:20.8427088Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:20.8428277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:20.8429209Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:20.8564732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:20.8565626Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:20.8567192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:20.8568177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:20.8667271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:20.8668179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:20.8670709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:20.8671663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:20.9448692Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:20.9449620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:20.9450812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:20.9452144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:21.0112590Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpot1928l5 2022-08-17T12:44:21.0114385Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpot1928l5/_remote_module_non_scriptable.py 2022-08-17T12:44:21.0236007Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplr5ujsrl 2022-08-17T12:44:21.0238039Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplr5ujsrl/_remote_module_non_scriptable.py 2022-08-17T12:44:21.0346216Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx7h04aqy 2022-08-17T12:44:21.0348912Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx7h04aqy/_remote_module_non_scriptable.py 2022-08-17T12:44:21.1226564Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvtkmztid 2022-08-17T12:44:21.1228241Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvtkmztid/_remote_module_non_scriptable.py 2022-08-17T12:44:21.4384669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:44:21.4431931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:44:21.4566644Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:44:21.5525921Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:44:21.9725037Z ok (4.029s) 
2022-08-17T12:44:21.9725246Z 2022-08-17T12:44:21.9725640Z ---------------------------------------------------------------------- 2022-08-17T12:44:21.9725986Z Ran 1 test in 4.029s 2022-08-17T12:44:21.9726155Z 2022-08-17T12:44:21.9726249Z OK 2022-08-17T12:44:21.9726384Z 2022-08-17T12:44:21.9726507Z Generating XML reports... 2022-08-17T12:44:21.9764797Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124417.xml 2022-08-17T12:44:23.7091058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:23.7092091Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:23.7093322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:23.7094262Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:23.8831288Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv7pfnn3_ 2022-08-17T12:44:23.8833302Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv7pfnn3_/_remote_module_non_scriptable.py 2022-08-17T12:44:24.3128196Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:44:24.3145264Z 2022-08-17T12:44:24.3145714Z Running tests... 2022-08-17T12:44:24.3146208Z ---------------------------------------------------------------------- 2022-08-17T12:44:25.8090371Z test_cuda_future_device_as_str (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:44:25.8275566Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6830 2022-08-17T12:44:25.8282384Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6831 2022-08-17T12:44:25.8289282Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 6832 2022-08-17T12:44:25.8295865Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 6833 2022-08-17T12:44:27.2183071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:27.2184038Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:27.2185020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:27.2185512Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:27.2239412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:27.2239898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:27.2242621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:27.2243136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:27.2469127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:27.2469667Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:27.2473255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:27.2473769Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:27.2524357Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:27.2524915Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:27.2527601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:27.2528104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:27.3858486Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7v8lpg2v 2022-08-17T12:44:27.3859811Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7v8lpg2v/_remote_module_non_scriptable.py 2022-08-17T12:44:27.3919285Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdwbl7keb 2022-08-17T12:44:27.3921992Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdwbl7keb/_remote_module_non_scriptable.py 2022-08-17T12:44:27.4227002Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl0dvt8xy 2022-08-17T12:44:27.4229792Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl0dvt8xy/_remote_module_non_scriptable.py 2022-08-17T12:44:27.4234325Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4ekqksba 2022-08-17T12:44:27.4237692Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4ekqksba/_remote_module_non_scriptable.py 2022-08-17T12:44:27.8086801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:44:27.8195529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:44:27.8483456Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:44:27.8583939Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:44:28.2368563Z ok (3.922s) 
2022-08-17T12:44:28.2368856Z 2022-08-17T12:44:28.2369574Z ---------------------------------------------------------------------- 2022-08-17T12:44:28.2369938Z Ran 1 test in 3.922s 2022-08-17T12:44:28.2370107Z 2022-08-17T12:44:28.2370201Z OK 2022-08-17T12:44:28.2370341Z 2022-08-17T12:44:28.2370461Z Generating XML reports... 2022-08-17T12:44:28.2407229Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124424.xml 2022-08-17T12:44:29.9931181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:29.9931716Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:29.9932558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:29.9933443Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:30.1667049Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy_jou61s 2022-08-17T12:44:30.1669469Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy_jou61s/_remote_module_non_scriptable.py 2022-08-17T12:44:30.5800218Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:44:30.5816108Z 2022-08-17T12:44:30.5816634Z Running tests... 2022-08-17T12:44:30.5817128Z ---------------------------------------------------------------------- 2022-08-17T12:44:32.0739330Z test_cuda_future_device_not_cuda (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:44:32.0920810Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7001 2022-08-17T12:44:32.0926654Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7002 2022-08-17T12:44:32.0933085Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 7003 2022-08-17T12:44:32.0939243Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 7004 2022-08-17T12:44:33.5684062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:33.5685034Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:33.5686221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:33.5687107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:33.5962881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:33.5963828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:33.5965824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:33.5966764Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:33.6211508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:33.6212453Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:33.6214364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:33.6215292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:33.6241775Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:33.6242725Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:33.6244464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:33.6245396Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:33.7400765Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2l73hw_s 2022-08-17T12:44:33.7401889Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2l73hw_s/_remote_module_non_scriptable.py 2022-08-17T12:44:33.7691700Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5tend6wy 2022-08-17T12:44:33.7693923Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5tend6wy/_remote_module_non_scriptable.py 2022-08-17T12:44:33.7941756Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqqog1888 2022-08-17T12:44:33.7942735Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqqog1888/_remote_module_non_scriptable.py 2022-08-17T12:44:33.7984881Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpukis39bs 2022-08-17T12:44:33.7988529Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpukis39bs/_remote_module_non_scriptable.py 2022-08-17T12:44:34.1660780Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:44:34.2045689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:44:34.2173401Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:44:34.2346639Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:44:34.6013917Z ok (4.019s) 
2022-08-17T12:44:34.6014151Z 2022-08-17T12:44:34.6014538Z ---------------------------------------------------------------------- 2022-08-17T12:44:34.6014868Z Ran 1 test in 4.020s 2022-08-17T12:44:34.6015035Z 2022-08-17T12:44:34.6015129Z OK 2022-08-17T12:44:34.6015265Z 2022-08-17T12:44:34.6015408Z Generating XML reports... 2022-08-17T12:44:34.6052579Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124430.xml 2022-08-17T12:44:36.3751322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:36.3751839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:36.3752978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:36.3753464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:36.5483389Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfor9jp4q 2022-08-17T12:44:36.5486155Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfor9jp4q/_remote_module_non_scriptable.py 2022-08-17T12:44:36.9741290Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:44:36.9757294Z 2022-08-17T12:44:36.9757766Z Running tests... 2022-08-17T12:44:36.9758247Z ---------------------------------------------------------------------- 2022-08-17T12:44:38.4851661Z test_cuda_future_modify_tensor_inplace (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:44:38.5036200Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7172 2022-08-17T12:44:38.5042643Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7173 2022-08-17T12:44:38.5048958Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 7174 2022-08-17T12:44:38.5055398Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 7175 2022-08-17T12:44:39.8627286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:39.8628254Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:39.8629439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:39.8630351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:39.8821166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:39.8822113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:39.8823626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:39.8824584Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:39.8889173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:39.8890096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:39.8891680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:39.8892923Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:39.9152032Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:39.9152973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:39.9154832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:39.9155800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:40.0318977Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr3mwd7d9 2022-08-17T12:44:40.0320059Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr3mwd7d9/_remote_module_non_scriptable.py 2022-08-17T12:44:40.0481502Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxnetvsvl 2022-08-17T12:44:40.0483288Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxnetvsvl/_remote_module_non_scriptable.py 2022-08-17T12:44:40.0562963Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9nu_b8t5 2022-08-17T12:44:40.0565004Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9nu_b8t5/_remote_module_non_scriptable.py 2022-08-17T12:44:40.0887965Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_abw5c24 2022-08-17T12:44:40.0889305Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_abw5c24/_remote_module_non_scriptable.py 2022-08-17T12:44:40.4596546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:44:40.4648488Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:44:40.4794025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:44:40.5184256Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:44:42.4178163Z ok (5.442s) 
2022-08-17T12:44:42.4178372Z 2022-08-17T12:44:42.4178781Z ---------------------------------------------------------------------- 2022-08-17T12:44:42.4179143Z Ran 1 test in 5.442s 2022-08-17T12:44:42.4179297Z 2022-08-17T12:44:42.4179389Z OK 2022-08-17T12:44:42.4179523Z 2022-08-17T12:44:42.4179658Z Generating XML reports... 2022-08-17T12:44:42.4215410Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124436.xml 2022-08-17T12:44:44.1581751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:44.1582404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:44.1583037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:44.1584022Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:44.3300441Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnr4uy294 2022-08-17T12:44:44.3302685Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnr4uy294/_remote_module_non_scriptable.py 2022-08-17T12:44:44.7410201Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:44:44.7425194Z 2022-08-17T12:44:44.7425484Z Running tests... 2022-08-17T12:44:44.7425922Z ---------------------------------------------------------------------- 2022-08-17T12:44:46.2171118Z test_cuda_future_replace_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:44:46.2347712Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7347 2022-08-17T12:44:46.2353923Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7348 2022-08-17T12:44:46.2360106Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 7349 2022-08-17T12:44:46.2366164Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 7350 2022-08-17T12:44:47.6312692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:47.6313219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:47.6314312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:47.6314816Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:47.6349341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:47.6349820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:47.6352715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:47.6353229Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:47.6447157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:47.6447634Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:47.6450127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:47.6450623Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:47.6505316Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:47.6505796Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:47.6508933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:47.6509438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:47.7995417Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa9_loqo6 2022-08-17T12:44:47.7996867Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa9_loqo6/_remote_module_non_scriptable.py 2022-08-17T12:44:47.8053027Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9a7_m07t 2022-08-17T12:44:47.8055711Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9a7_m07t/_remote_module_non_scriptable.py 2022-08-17T12:44:47.8161014Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6fyaq1yl 2022-08-17T12:44:47.8163638Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6fyaq1yl/_remote_module_non_scriptable.py 2022-08-17T12:44:47.8246568Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp60gsre5x 2022-08-17T12:44:47.8249552Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp60gsre5x/_remote_module_non_scriptable.py 2022-08-17T12:44:48.2257319Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:44:48.2370892Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:44:48.2473776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:44:48.2563239Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:44:50.2475768Z ok (5.505s) 
2022-08-17T12:44:50.2476093Z 2022-08-17T12:44:50.2476483Z ---------------------------------------------------------------------- 2022-08-17T12:44:50.2476831Z Ran 1 test in 5.505s 2022-08-17T12:44:50.2477003Z 2022-08-17T12:44:50.2477101Z OK 2022-08-17T12:44:50.2477243Z 2022-08-17T12:44:50.2477379Z Generating XML reports... 2022-08-17T12:44:50.2512961Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124444.xml 2022-08-17T12:44:52.0140859Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:52.0141434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:52.0142711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:52.0143213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:52.1881861Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6lt43w9v 2022-08-17T12:44:52.1884748Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6lt43w9v/_remote_module_non_scriptable.py 2022-08-17T12:44:52.6108399Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:44:52.6124574Z 2022-08-17T12:44:52.6124929Z Running tests... 2022-08-17T12:44:52.6125367Z ---------------------------------------------------------------------- 2022-08-17T12:44:54.1328999Z test_cuda_future_value_on_bad_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:44:54.1512478Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7522 2022-08-17T12:44:54.1518894Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7523 2022-08-17T12:44:54.1525481Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 7524 2022-08-17T12:44:54.1531917Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 7525 2022-08-17T12:44:55.5434210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:55.5435189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:55.5436386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:55.5437323Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:55.5470680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:55.5471601Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:55.5473535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:55.5474485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:55.5917830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:55.5918758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:55.5920452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:55.5921405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:55.6111395Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:44:55.6112326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:44:55.6113827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:44:55.6114801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:44:55.7144676Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpli22k7gk 2022-08-17T12:44:55.7146077Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpli22k7gk/_remote_module_non_scriptable.py 2022-08-17T12:44:55.7147330Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7jruruzb 2022-08-17T12:44:55.7150735Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7jruruzb/_remote_module_non_scriptable.py 2022-08-17T12:44:55.7625781Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph4tw_p4k 2022-08-17T12:44:55.7626857Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph4tw_p4k/_remote_module_non_scriptable.py 2022-08-17T12:44:55.7870109Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo0fi8vjs 2022-08-17T12:44:55.7871310Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo0fi8vjs/_remote_module_non_scriptable.py 2022-08-17T12:44:56.1368088Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:44:56.1389252Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:44:56.1871597Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:44:56.2156326Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:45:02.1734454Z ok (9.561s) 
2022-08-17T12:45:02.1734671Z 2022-08-17T12:45:02.1735107Z ---------------------------------------------------------------------- 2022-08-17T12:45:02.1735436Z Ran 1 test in 9.561s 2022-08-17T12:45:02.1735614Z 2022-08-17T12:45:02.1735708Z OK 2022-08-17T12:45:02.1735859Z 2022-08-17T12:45:02.1735997Z Generating XML reports... 2022-08-17T12:45:02.1773173Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124452.xml 2022-08-17T12:45:03.9532260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:03.9532777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:03.9533838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:03.9534366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:04.1262832Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp86wmcumu 2022-08-17T12:45:04.1265398Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp86wmcumu/_remote_module_non_scriptable.py 2022-08-17T12:45:04.5511216Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:45:04.5526984Z 2022-08-17T12:45:04.5527420Z Running tests... 2022-08-17T12:45:04.5527924Z ---------------------------------------------------------------------- 2022-08-17T12:45:06.0610630Z test_custom_stream (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:45:06.0770529Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/79750 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.524s) 2022-08-17T12:45:06.0771448Z 2022-08-17T12:45:06.0772028Z ---------------------------------------------------------------------- 2022-08-17T12:45:06.0772371Z Ran 1 test in 1.524s 2022-08-17T12:45:06.0772544Z 2022-08-17T12:45:06.0772657Z OK (skipped=1) 2022-08-17T12:45:06.0772820Z 2022-08-17T12:45:06.0773808Z Generating XML reports... 2022-08-17T12:45:06.0805161Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124504.xml 2022-08-17T12:45:07.8292616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:07.8293139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:07.8293797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:07.8294527Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:08.0026864Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn2nrsiad 2022-08-17T12:45:08.0029396Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn2nrsiad/_remote_module_non_scriptable.py 2022-08-17T12:45:08.4272252Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:45:08.4288140Z 2022-08-17T12:45:08.4288339Z Running tests... 2022-08-17T12:45:08.4288766Z ---------------------------------------------------------------------- 2022-08-17T12:45:09.9389030Z test_custom_stream_multi (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:45:09.9572463Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7735 2022-08-17T12:45:09.9578820Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7736 2022-08-17T12:45:09.9585015Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 7737 2022-08-17T12:45:09.9591947Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 7738 2022-08-17T12:45:11.3378719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:11.3379250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:11.3379863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:11.3380367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:11.3556701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:11.3557180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:11.3560010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:11.3560530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:11.3711824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:11.3712305Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:11.3715118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:11.3715625Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:11.3875014Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:11.3875498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:11.3878728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:11.3879243Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:11.5058885Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnh4sk8ks 2022-08-17T12:45:11.5059955Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnh4sk8ks/_remote_module_non_scriptable.py 2022-08-17T12:45:11.5258031Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcgiqwaxb 2022-08-17T12:45:11.5260656Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcgiqwaxb/_remote_module_non_scriptable.py 2022-08-17T12:45:11.5457072Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjvy4a_ma 2022-08-17T12:45:11.5459784Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjvy4a_ma/_remote_module_non_scriptable.py 2022-08-17T12:45:11.5568704Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6njgpl75 2022-08-17T12:45:11.5571161Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6njgpl75/_remote_module_non_scriptable.py 2022-08-17T12:45:11.9291470Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:45:11.9555984Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:45:11.9776085Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:45:11.9821550Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:45:12.0312578Z fi_getinfo: -61 
2022-08-17T12:45:12.0575787Z fi_getinfo: -61 2022-08-17T12:45:12.0798610Z fi_getinfo: -61 2022-08-17T12:45:12.0840822Z fi_getinfo: -61 2022-08-17T12:45:26.4996307Z ok (18.070s) 2022-08-17T12:45:26.4996533Z 2022-08-17T12:45:26.4996934Z ---------------------------------------------------------------------- 2022-08-17T12:45:26.4997268Z Ran 1 test in 18.071s 2022-08-17T12:45:26.4997457Z 2022-08-17T12:45:26.4997553Z OK 2022-08-17T12:45:26.4997695Z 2022-08-17T12:45:26.4997832Z Generating XML reports... 2022-08-17T12:45:26.5035035Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124508.xml 2022-08-17T12:45:28.2881524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:28.2882056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:28.2883337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:28.2883842Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:28.4614227Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprpyuzlfc 2022-08-17T12:45:28.4616664Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprpyuzlfc/_remote_module_non_scriptable.py 2022-08-17T12:45:28.8889940Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:45:28.8906203Z 2022-08-17T12:45:28.8906546Z Running tests... 2022-08-17T12:45:28.8907032Z ---------------------------------------------------------------------- 2022-08-17T12:45:30.3922558Z test_custom_stream_nested (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:45:30.4105856Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8094 2022-08-17T12:45:30.4112313Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8095 2022-08-17T12:45:30.4118509Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 8096 2022-08-17T12:45:30.4124918Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 8097 2022-08-17T12:45:31.8293113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:31.8293675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:31.8295069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:31.8295627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:31.8538748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:31.8539578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:31.8540436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:31.8540933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:31.8594335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:31.8595125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:31.8597574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:31.8598085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:31.8872124Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:31.8872600Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:31.8875751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:31.8876243Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:32.0052824Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvrfhozfh 2022-08-17T12:45:32.0054063Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvrfhozfh/_remote_module_non_scriptable.py 2022-08-17T12:45:32.0208605Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbxj7rvf2 2022-08-17T12:45:32.0210760Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbxj7rvf2/_remote_module_non_scriptable.py 2022-08-17T12:45:32.0257724Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmykv67ra 2022-08-17T12:45:32.0259732Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmykv67ra/_remote_module_non_scriptable.py 2022-08-17T12:45:32.0546415Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv1t9ssuo 2022-08-17T12:45:32.0547836Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv1t9ssuo/_remote_module_non_scriptable.py 2022-08-17T12:45:32.4454283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:45:32.4494149Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:45:32.4498954Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:45:32.4704051Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:45:32.5473012Z fi_getinfo: -61 
2022-08-17T12:45:32.5512580Z fi_getinfo: -61 2022-08-17T12:45:32.5516752Z fi_getinfo: -61 2022-08-17T12:45:32.5721443Z fi_getinfo: -61 2022-08-17T12:45:41.6404871Z ok (12.750s) 2022-08-17T12:45:41.6405319Z 2022-08-17T12:45:41.6406068Z ---------------------------------------------------------------------- 2022-08-17T12:45:41.6406525Z Ran 1 test in 12.750s 2022-08-17T12:45:41.6406701Z 2022-08-17T12:45:41.6406799Z OK 2022-08-17T12:45:41.6409914Z 2022-08-17T12:45:41.6410306Z Generating XML reports... 2022-08-17T12:45:41.6442568Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124528.xml 2022-08-17T12:45:43.4178954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:43.4179482Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:43.4180906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:43.4181426Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:43.5920227Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt5hiyvm_ 2022-08-17T12:45:43.5922592Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt5hiyvm_/_remote_module_non_scriptable.py 2022-08-17T12:45:44.0156761Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:45:44.0172677Z 2022-08-17T12:45:44.0173142Z Running tests... 2022-08-17T12:45:44.0173654Z ---------------------------------------------------------------------- 2022-08-17T12:45:45.5210322Z test_custom_stream_nested_multi (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:45:45.5394157Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8453 2022-08-17T12:45:45.5400409Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8454 2022-08-17T12:45:45.5406862Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 8455 2022-08-17T12:45:45.5413196Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 8456 2022-08-17T12:45:46.9182736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:46.9184084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:46.9185276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:46.9186235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:46.9724240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:46.9725243Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:46.9726399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:46.9727266Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:46.9728406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:46.9729350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:46.9730531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:46.9731447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:46.9892974Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:46.9893922Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:46.9895992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:46.9896925Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:47.0880546Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcaj7_3jj 2022-08-17T12:45:47.0881955Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcaj7_3jj/_remote_module_non_scriptable.py 2022-08-17T12:45:47.1466077Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_kauuiht 2022-08-17T12:45:47.1467601Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_kauuiht/_remote_module_non_scriptable.py 2022-08-17T12:45:47.1503538Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3d6f66pi 2022-08-17T12:45:47.1507765Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3d6f66pi/_remote_module_non_scriptable.py 2022-08-17T12:45:47.1562340Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi9caii85 2022-08-17T12:45:47.1564463Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi9caii85/_remote_module_non_scriptable.py 2022-08-17T12:45:47.5005116Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:45:47.5775217Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:45:47.5809784Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:45:47.5847352Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:45:47.6026507Z fi_getinfo: -61 
2022-08-17T12:45:47.6795453Z fi_getinfo: -61 2022-08-17T12:45:47.6830796Z fi_getinfo: -61 2022-08-17T12:45:47.6864994Z fi_getinfo: -61 2022-08-17T12:45:55.3658645Z ok (11.348s) 2022-08-17T12:45:55.3659073Z 2022-08-17T12:45:55.3659819Z ---------------------------------------------------------------------- 2022-08-17T12:45:55.3660159Z Ran 1 test in 11.348s 2022-08-17T12:45:55.3660334Z 2022-08-17T12:45:55.3660430Z OK 2022-08-17T12:45:55.3660573Z 2022-08-17T12:45:55.3662747Z Generating XML reports... 2022-08-17T12:45:55.3695869Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124544.xml 2022-08-17T12:45:57.1700406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:45:57.1700931Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:45:57.1701727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:45:57.1702230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:45:57.3434457Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9l0p4g22 2022-08-17T12:45:57.3437063Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9l0p4g22/_remote_module_non_scriptable.py 2022-08-17T12:45:57.7661302Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:45:57.7677373Z 2022-08-17T12:45:57.7677784Z Running tests... 2022-08-17T12:45:57.7678269Z ---------------------------------------------------------------------- 2022-08-17T12:45:59.2791015Z test_device_map_cpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:45:59.2968487Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8807 2022-08-17T12:45:59.2975097Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8808 2022-08-17T12:45:59.2980749Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 8809 2022-08-17T12:45:59.2987532Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 8810 2022-08-17T12:46:00.6827670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:00.6828202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:00.6829142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:00.6829620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:00.6960405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:00.6960877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:00.6963513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:00.6964002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:00.7214166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:00.7214659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:00.7217090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:00.7217569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:00.7252142Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:00.7252608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:00.7255508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:00.7256131Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:00.8511593Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo8q5zkc_ 2022-08-17T12:46:00.8513224Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo8q5zkc_/_remote_module_non_scriptable.py 2022-08-17T12:46:00.8641679Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpshp96nsc 2022-08-17T12:46:00.8644502Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpshp96nsc/_remote_module_non_scriptable.py 2022-08-17T12:46:00.8958303Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2rj50_s7 2022-08-17T12:46:00.8959165Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpghii09xi 2022-08-17T12:46:00.8960667Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2rj50_s7/_remote_module_non_scriptable.py 2022-08-17T12:46:00.8961889Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpghii09xi/_remote_module_non_scriptable.py 2022-08-17T12:46:01.2714419Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:46:01.2839646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:46:01.3237221Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:46:01.3306504Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:46:01.3734879Z fi_getinfo: -61 
2022-08-17T12:46:01.3860590Z fi_getinfo: -61 2022-08-17T12:46:01.4256770Z fi_getinfo: -61 2022-08-17T12:46:01.4325242Z fi_getinfo: -61 2022-08-17T12:46:02.3075023Z ok (4.539s) 2022-08-17T12:46:02.3075213Z 2022-08-17T12:46:02.3075621Z ---------------------------------------------------------------------- 2022-08-17T12:46:02.3075971Z Ran 1 test in 4.540s 2022-08-17T12:46:02.3076167Z 2022-08-17T12:46:02.3076265Z OK 2022-08-17T12:46:02.3076402Z 2022-08-17T12:46:02.3076531Z Generating XML reports... 2022-08-17T12:46:02.3112709Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124557.xml 2022-08-17T12:46:04.0603481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:04.0604002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:04.0605023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:04.0605503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:04.2284524Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf82mfnou 2022-08-17T12:46:04.2286980Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf82mfnou/_remote_module_non_scriptable.py 2022-08-17T12:46:04.6460027Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:46:04.6475210Z 2022-08-17T12:46:04.6475413Z Running tests... 2022-08-17T12:46:04.6476124Z ---------------------------------------------------------------------- 2022-08-17T12:46:06.1303844Z test_device_map_cpu_to_gpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:46:06.1481050Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9150 2022-08-17T12:46:06.1487164Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9151 2022-08-17T12:46:06.1493299Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9152 2022-08-17T12:46:06.1499348Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9153 2022-08-17T12:46:07.5385604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:07.5386480Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:07.5387300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:07.5387782Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:07.5431555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:07.5432031Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:07.5434801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:07.5435274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:07.5589226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:07.5589698Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:07.5592439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:07.5592915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:07.5762729Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:07.5763203Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:07.5765881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:07.5766350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:07.7068485Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphhn6ixda 2022-08-17T12:46:07.7069586Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphhn6ixda/_remote_module_non_scriptable.py 2022-08-17T12:46:07.7115394Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphexhvgpz 2022-08-17T12:46:07.7118107Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphexhvgpz/_remote_module_non_scriptable.py 2022-08-17T12:46:07.7326932Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd1ph00tt 2022-08-17T12:46:07.7328357Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd1ph00tt/_remote_module_non_scriptable.py 2022-08-17T12:46:07.7451860Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp28sox_h3 2022-08-17T12:46:07.7454288Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp28sox_h3/_remote_module_non_scriptable.py 2022-08-17T12:46:08.1364428Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:46:08.1391202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:46:08.1612226Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:46:08.1667499Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:46:08.2470409Z fi_getinfo: -61 
2022-08-17T12:46:08.2473895Z fi_getinfo: -61 2022-08-17T12:46:08.2630450Z fi_getinfo: -61 2022-08-17T12:46:08.2684761Z fi_getinfo: -61 2022-08-17T12:46:11.7651067Z ok (7.117s) 2022-08-17T12:46:11.7651281Z 2022-08-17T12:46:11.7651685Z ---------------------------------------------------------------------- 2022-08-17T12:46:11.7652033Z Ran 1 test in 7.117s 2022-08-17T12:46:11.7652207Z 2022-08-17T12:46:11.7652304Z OK 2022-08-17T12:46:11.7652452Z 2022-08-17T12:46:11.7652578Z Generating XML reports... 2022-08-17T12:46:11.7689413Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124604.xml 2022-08-17T12:46:13.5373021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:13.5373542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:13.5374757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:13.5375274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:13.7102142Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk9gco9_9 2022-08-17T12:46:13.7105074Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk9gco9_9/_remote_module_non_scriptable.py 2022-08-17T12:46:14.1324437Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:46:14.1340108Z 2022-08-17T12:46:14.1340253Z Running tests... 2022-08-17T12:46:14.1340700Z ---------------------------------------------------------------------- 2022-08-17T12:46:15.6397530Z test_device_map_cpu_to_gpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:46:15.6581424Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9501 2022-08-17T12:46:15.6587915Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9502 2022-08-17T12:46:15.6594117Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9503 2022-08-17T12:46:15.6601503Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9504 2022-08-17T12:46:17.0331012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:17.0331533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:17.0332158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:17.0332654Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:17.0690650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:17.0691158Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:17.0693586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:17.0694062Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:17.0762311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:17.0762790Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:17.0765724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:17.0766204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:17.0780159Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:17.0780934Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:17.0783771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:17.0784256Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:17.2012434Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7hyvpd4m 2022-08-17T12:46:17.2013385Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7hyvpd4m/_remote_module_non_scriptable.py 2022-08-17T12:46:17.2422566Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw2mnq8dz 2022-08-17T12:46:17.2425743Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw2mnq8dz/_remote_module_non_scriptable.py 2022-08-17T12:46:17.2460673Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdcmmf2nw 2022-08-17T12:46:17.2463476Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdcmmf2nw/_remote_module_non_scriptable.py 2022-08-17T12:46:17.2487669Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz9ulzajh 2022-08-17T12:46:17.2490447Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz9ulzajh/_remote_module_non_scriptable.py 2022-08-17T12:46:17.6175521Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:46:17.6726860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:46:17.6830946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:46:17.6841288Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:46:17.7310181Z fi_getinfo: -61 
2022-08-17T12:46:17.7745630Z fi_getinfo: -61 2022-08-17T12:46:17.7849290Z fi_getinfo: -61 2022-08-17T12:46:17.7861812Z fi_getinfo: -61 2022-08-17T12:46:21.2752929Z ok (7.141s) 2022-08-17T12:46:21.2753160Z 2022-08-17T12:46:21.2753590Z ---------------------------------------------------------------------- 2022-08-17T12:46:21.2753950Z Ran 1 test in 7.141s 2022-08-17T12:46:21.2754120Z 2022-08-17T12:46:21.2754199Z OK 2022-08-17T12:46:21.2754338Z 2022-08-17T12:46:21.2754480Z Generating XML reports... 2022-08-17T12:46:21.2790948Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124614.xml 2022-08-17T12:46:23.0496588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:23.0497391Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:23.0498332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:23.0498852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:23.2238700Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpumxmxgjs 2022-08-17T12:46:23.2241040Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpumxmxgjs/_remote_module_non_scriptable.py 2022-08-17T12:46:23.6517209Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:46:23.6532954Z 2022-08-17T12:46:23.6533385Z Running tests... 2022-08-17T12:46:23.6534319Z ---------------------------------------------------------------------- 2022-08-17T12:46:25.1771753Z test_device_map_gpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:46:25.1956602Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9852 2022-08-17T12:46:25.1963026Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9853 2022-08-17T12:46:25.1969085Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9854 2022-08-17T12:46:25.1975863Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9855 2022-08-17T12:46:26.5905425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:26.5906414Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:26.5907601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:26.5908520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:26.5932336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:26.5933228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:26.5934771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:26.5935682Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:26.5936857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:26.5937802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:26.5938956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:26.5939872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:26.6100986Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:26.6101905Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:26.6103906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:26.6104842Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:26.7620240Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe2ip6_pv 2022-08-17T12:46:26.7621408Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe2ip6_pv/_remote_module_non_scriptable.py 2022-08-17T12:46:26.7676569Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7ca1iw4l 2022-08-17T12:46:26.7678811Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7ca1iw4l/_remote_module_non_scriptable.py 2022-08-17T12:46:26.7679856Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiwq2dmim 2022-08-17T12:46:26.7681577Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiwq2dmim/_remote_module_non_scriptable.py 2022-08-17T12:46:26.7839473Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjjjtt9uz 2022-08-17T12:46:26.7841350Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjjjtt9uz/_remote_module_non_scriptable.py 2022-08-17T12:46:27.1940530Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:46:27.1971240Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:46:27.1977306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:46:27.2157775Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:46:27.2983414Z fi_getinfo: -61 
2022-08-17T12:46:27.2990021Z fi_getinfo: -61 2022-08-17T12:46:27.2996256Z fi_getinfo: -61 2022-08-17T12:46:27.3176842Z fi_getinfo: -61 2022-08-17T12:46:30.8128214Z ok (7.159s) 2022-08-17T12:46:30.8128406Z 2022-08-17T12:46:30.8128807Z ---------------------------------------------------------------------- 2022-08-17T12:46:30.8129157Z Ran 1 test in 7.159s 2022-08-17T12:46:30.8129344Z 2022-08-17T12:46:30.8129433Z OK 2022-08-17T12:46:30.8129572Z 2022-08-17T12:46:30.8129689Z Generating XML reports... 2022-08-17T12:46:30.8167647Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124623.xml 2022-08-17T12:46:32.5918378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:32.5918902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:32.5920260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:32.5920750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:32.7661769Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpod7yornj 2022-08-17T12:46:32.7664639Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpod7yornj/_remote_module_non_scriptable.py 2022-08-17T12:46:33.1905030Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:46:33.1921374Z 2022-08-17T12:46:33.1921777Z Running tests... 2022-08-17T12:46:33.1922222Z ---------------------------------------------------------------------- 2022-08-17T12:46:34.6971523Z test_device_map_gpu_default_to_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:46:34.7127008Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/80008 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.520s) 2022-08-17T12:46:34.7127599Z 2022-08-17T12:46:34.7127884Z ---------------------------------------------------------------------- 2022-08-17T12:46:34.7128225Z Ran 1 test in 1.520s 2022-08-17T12:46:34.7128391Z 2022-08-17T12:46:34.7128501Z OK (skipped=1) 2022-08-17T12:46:34.7128659Z 2022-08-17T12:46:34.7128788Z Generating XML reports... 2022-08-17T12:46:34.7160675Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124633.xml 2022-08-17T12:46:36.4687649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:36.4688316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:36.4689748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:36.4690251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:36.6426894Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp58dqstga 2022-08-17T12:46:36.6429636Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp58dqstga/_remote_module_non_scriptable.py 2022-08-17T12:46:37.0989486Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:46:37.1005816Z 2022-08-17T12:46:37.1006101Z Running tests... 
2022-08-17T12:46:37.1006550Z ---------------------------------------------------------------------- 2022-08-17T12:46:38.6174698Z test_device_map_gpu_mixed_1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:46:38.6360154Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10233 2022-08-17T12:46:38.6366658Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10234 2022-08-17T12:46:38.6373118Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10235 2022-08-17T12:46:38.6379733Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10236 2022-08-17T12:46:40.0262627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:40.0263736Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:40.0265237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:40.0265823Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:40.0296830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:40.0297324Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:40.0299989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:40.0300480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:40.0550189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:40.0550826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:40.0553865Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:40.0554353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:40.1104173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:40.1104660Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:40.1107006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:40.1107497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:40.1969641Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0zyszd5j 2022-08-17T12:46:40.1971490Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0zyszd5j/_remote_module_non_scriptable.py 2022-08-17T12:46:40.1990230Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjv5mrpb_ 2022-08-17T12:46:40.1992926Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjv5mrpb_/_remote_module_non_scriptable.py 2022-08-17T12:46:40.2312376Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp67hnvf3m 2022-08-17T12:46:40.2315031Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp67hnvf3m/_remote_module_non_scriptable.py 2022-08-17T12:46:40.2783111Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv5fyefdw 2022-08-17T12:46:40.2784208Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv5fyefdw/_remote_module_non_scriptable.py 2022-08-17T12:46:40.6207386Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:46:40.6222572Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:46:40.6599285Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:46:40.6858399Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:46:40.7229306Z fi_getinfo: -61 2022-08-17T12:46:40.7240597Z fi_getinfo: -61 2022-08-17T12:46:40.7617764Z fi_getinfo: -61 2022-08-17T12:46:40.7876981Z fi_getinfo: -61 2022-08-17T12:46:46.5585739Z ok (9.458s) 2022-08-17T12:46:46.5585957Z 2022-08-17T12:46:46.5586343Z ---------------------------------------------------------------------- 2022-08-17T12:46:46.5586698Z Ran 1 test in 9.458s 2022-08-17T12:46:46.5586874Z 2022-08-17T12:46:46.5586969Z OK 2022-08-17T12:46:46.5587115Z 2022-08-17T12:46:46.5587254Z Generating XML reports... 2022-08-17T12:46:46.5623500Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124637.xml 2022-08-17T12:46:48.3401309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:48.3402136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:48.3403150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:48.3403654Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:48.5132903Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuhechg6q 2022-08-17T12:46:48.5135462Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuhechg6q/_remote_module_non_scriptable.py 2022-08-17T12:46:48.9389162Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:46:48.9405728Z 2022-08-17T12:46:48.9406111Z Running tests... 
2022-08-17T12:46:48.9406625Z ---------------------------------------------------------------------- 2022-08-17T12:46:50.4515167Z test_device_map_gpu_mixed_2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:46:50.4701173Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10584 2022-08-17T12:46:50.4707923Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10585 2022-08-17T12:46:50.4714667Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10586 2022-08-17T12:46:50.4721272Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10587 2022-08-17T12:46:51.8682279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:51.8683228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:51.8684411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:51.8685385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:51.8709104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:51.8710037Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:51.8712909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:51.8713852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:51.8781194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:51.8782081Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:51.8784589Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:51.8785463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:51.8897552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:46:51.8898529Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:46:51.8900249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:46:51.8901205Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:46:52.0380751Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx649qnzn 2022-08-17T12:46:52.0382159Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx649qnzn/_remote_module_non_scriptable.py 2022-08-17T12:46:52.0406286Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1v6wgexf 2022-08-17T12:46:52.0408911Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1v6wgexf/_remote_module_non_scriptable.py 2022-08-17T12:46:52.0455040Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9yzwjva1 2022-08-17T12:46:52.0457792Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9yzwjva1/_remote_module_non_scriptable.py 2022-08-17T12:46:52.0643031Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz41k8cmg 2022-08-17T12:46:52.0645998Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz41k8cmg/_remote_module_non_scriptable.py 2022-08-17T12:46:52.4654191Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:46:52.4692113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:46:52.4714473Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:46:52.4940684Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:46:52.5675351Z fi_getinfo: -61 2022-08-17T12:46:52.5711135Z fi_getinfo: -61 2022-08-17T12:46:52.5732514Z fi_getinfo: -61 2022-08-17T12:46:52.5960874Z fi_getinfo: -61 2022-08-17T12:46:58.3927674Z ok (9.452s) 2022-08-17T12:46:58.3927882Z 2022-08-17T12:46:58.3928412Z ---------------------------------------------------------------------- 2022-08-17T12:46:58.3928888Z Ran 1 test in 9.452s 2022-08-17T12:46:58.3929063Z 2022-08-17T12:46:58.3929158Z OK 2022-08-17T12:46:58.3929296Z 2022-08-17T12:46:58.3929433Z Generating XML reports... 2022-08-17T12:46:58.3968697Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124648.xml 2022-08-17T12:47:00.1736270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:00.1736803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:00.1737785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:00.1738307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:00.3480197Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp97fhqb0n 2022-08-17T12:47:00.3481975Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp97fhqb0n/_remote_module_non_scriptable.py 2022-08-17T12:47:00.7737277Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:47:00.7753335Z 2022-08-17T12:47:00.7753570Z Running tests... 
2022-08-17T12:47:00.7754009Z ---------------------------------------------------------------------- 2022-08-17T12:47:02.3034723Z test_device_map_gpu_mixed_3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:47:02.3219587Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10935 2022-08-17T12:47:02.3226106Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10936 2022-08-17T12:47:02.3232422Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10937 2022-08-17T12:47:02.3239369Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10938 2022-08-17T12:47:03.7198283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:03.7199279Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:03.7200451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:03.7201385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:03.7352424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:03.7353336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:03.7355353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:03.7356309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:03.7373209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:03.7374177Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:03.7375910Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:03.7376823Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:03.7430743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:03.7431604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:03.7433428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:03.7434533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:03.8891973Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp799en0e9 2022-08-17T12:47:03.8893406Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp799en0e9/_remote_module_non_scriptable.py 2022-08-17T12:47:03.9108982Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppici1_fn 2022-08-17T12:47:03.9112020Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppici1_fn/_remote_module_non_scriptable.py 2022-08-17T12:47:03.9131838Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptzp58eow 2022-08-17T12:47:03.9134837Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptzp58eow/_remote_module_non_scriptable.py 2022-08-17T12:47:03.9195657Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp3mccsqe 2022-08-17T12:47:03.9198341Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp3mccsqe/_remote_module_non_scriptable.py 2022-08-17T12:47:04.3211650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:47:04.3420054Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:47:04.3494781Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:47:04.3524944Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:47:04.4232943Z fi_getinfo: -61 2022-08-17T12:47:04.4439337Z fi_getinfo: -61 2022-08-17T12:47:04.4514640Z fi_getinfo: -61 2022-08-17T12:47:04.4543495Z fi_getinfo: -61 2022-08-17T12:47:10.2463002Z ok (9.471s) 2022-08-17T12:47:10.2463325Z 2022-08-17T12:47:10.2464002Z ---------------------------------------------------------------------- 2022-08-17T12:47:10.2464380Z Ran 1 test in 9.471s 2022-08-17T12:47:10.2464551Z 2022-08-17T12:47:10.2464649Z OK 2022-08-17T12:47:10.2464769Z 2022-08-17T12:47:10.2464907Z Generating XML reports... 2022-08-17T12:47:10.2500199Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124700.xml 2022-08-17T12:47:11.9827255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:11.9827761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:11.9832775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:11.9833284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:12.1569738Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj35ztsyj 2022-08-17T12:47:12.1571404Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj35ztsyj/_remote_module_non_scriptable.py 2022-08-17T12:47:12.5830619Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:47:12.5846199Z 2022-08-17T12:47:12.5846641Z Running tests... 
2022-08-17T12:47:12.5847107Z ---------------------------------------------------------------------- 2022-08-17T12:47:14.1076187Z test_device_map_gpu_mixed_4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:47:14.1259782Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11286 2022-08-17T12:47:14.1266541Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11287 2022-08-17T12:47:14.1273103Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11288 2022-08-17T12:47:14.1279822Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 11289 2022-08-17T12:47:15.5214177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:15.5215082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:15.5216157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:15.5216803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:15.5421663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:15.5422312Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:15.5424761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:15.5425275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:15.5438368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:15.5438870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:15.5441624Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:15.5442127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:15.5527795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:15.5528275Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:15.5531400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:15.5531900Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:15.6897497Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb3wttvz_ 2022-08-17T12:47:15.6899024Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb3wttvz_/_remote_module_non_scriptable.py 2022-08-17T12:47:15.7142222Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdkex9den 2022-08-17T12:47:15.7144927Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdkex9den/_remote_module_non_scriptable.py 2022-08-17T12:47:15.7150217Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppfztial6 2022-08-17T12:47:15.7153015Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppfztial6/_remote_module_non_scriptable.py 2022-08-17T12:47:15.7289720Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp3j02erq 2022-08-17T12:47:15.7292576Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp3j02erq/_remote_module_non_scriptable.py 2022-08-17T12:47:16.1144450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:47:16.1404163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:47:16.1416099Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:47:16.1640500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:47:16.2165321Z fi_getinfo: -61 2022-08-17T12:47:16.2423403Z fi_getinfo: -61 2022-08-17T12:47:16.2434526Z fi_getinfo: -61 2022-08-17T12:47:16.2659505Z fi_getinfo: -61 2022-08-17T12:47:22.0487206Z ok (9.464s) 2022-08-17T12:47:22.0487396Z 2022-08-17T12:47:22.0487802Z ---------------------------------------------------------------------- 2022-08-17T12:47:22.0488144Z Ran 1 test in 9.464s 2022-08-17T12:47:22.0488310Z 2022-08-17T12:47:22.0488404Z OK 2022-08-17T12:47:22.0488540Z 2022-08-17T12:47:22.0488662Z Generating XML reports... 2022-08-17T12:47:22.0524971Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124712.xml 2022-08-17T12:47:23.8311872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:23.8312397Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:23.8314600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:23.8315101Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:24.0046429Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_wbko9s_ 2022-08-17T12:47:24.0049745Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_wbko9s_/_remote_module_non_scriptable.py 2022-08-17T12:47:24.4289730Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:47:24.4307085Z 2022-08-17T12:47:24.4307380Z Running tests... 
2022-08-17T12:47:24.4307846Z ---------------------------------------------------------------------- 2022-08-17T12:47:25.9422928Z test_device_map_gpu_mixed_5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:47:25.9608965Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11637 2022-08-17T12:47:25.9615689Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11638 2022-08-17T12:47:25.9622796Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11639 2022-08-17T12:47:25.9630596Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 11640 2022-08-17T12:47:27.3735725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:27.3736246Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:27.3737213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:27.3737731Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:27.3792133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:27.3792612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:27.3795415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:27.3795905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:27.3864566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:27.3865039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:27.3868179Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:27.3868674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:27.3977483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:27.3978220Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:27.3981007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:27.3981494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:27.5411791Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8_3yjs2z 2022-08-17T12:47:27.5413138Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8_3yjs2z/_remote_module_non_scriptable.py 2022-08-17T12:47:27.5463981Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpohi7iws7 2022-08-17T12:47:27.5466761Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpohi7iws7/_remote_module_non_scriptable.py 2022-08-17T12:47:27.5637547Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5_3ltanf 2022-08-17T12:47:27.5640669Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5_3ltanf/_remote_module_non_scriptable.py 2022-08-17T12:47:27.5709101Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcctuq6ad 2022-08-17T12:47:27.5711880Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcctuq6ad/_remote_module_non_scriptable.py 2022-08-17T12:47:27.9711737Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:47:27.9764318Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:47:28.0005090Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:47:28.0027057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:47:28.0732437Z fi_getinfo: -61 2022-08-17T12:47:28.0783716Z fi_getinfo: -61 2022-08-17T12:47:28.1022988Z fi_getinfo: -61 2022-08-17T12:47:28.1043978Z fi_getinfo: -61 2022-08-17T12:47:33.8837277Z ok (9.453s) 2022-08-17T12:47:33.8837800Z 2022-08-17T12:47:33.8838371Z ---------------------------------------------------------------------- 2022-08-17T12:47:33.8838723Z Ran 1 test in 9.453s 2022-08-17T12:47:33.8838893Z 2022-08-17T12:47:33.8838999Z OK 2022-08-17T12:47:33.8839137Z 2022-08-17T12:47:33.8839256Z Generating XML reports... 2022-08-17T12:47:33.8875182Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124724.xml 2022-08-17T12:47:35.6397344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:35.6397867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:35.6399081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:35.6399581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:35.8132481Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2r_1hth9 2022-08-17T12:47:35.8134781Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2r_1hth9/_remote_module_non_scriptable.py 2022-08-17T12:47:36.2386245Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:47:36.2402374Z 2022-08-17T12:47:36.2402663Z Running tests... 
2022-08-17T12:47:36.2403102Z ---------------------------------------------------------------------- 2022-08-17T12:47:37.7485299Z test_device_map_gpu_mixed_6 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:47:37.7670525Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11996 2022-08-17T12:47:37.7676726Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11997 2022-08-17T12:47:37.7682866Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11998 2022-08-17T12:47:37.7689664Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 11999 2022-08-17T12:47:39.1567972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:39.1568959Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:39.1570128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:39.1571043Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:39.1605107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:39.1605983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:39.1608608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:39.1609577Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:39.2121484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:39.2122402Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:39.2124042Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:39.2125010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:39.2151531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:39.2152447Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:39.2154220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:39.2155191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:39.3265371Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzk5ecljr 2022-08-17T12:47:39.3266782Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzk5ecljr/_remote_module_non_scriptable.py 2022-08-17T12:47:39.3299428Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq9a_uy55 2022-08-17T12:47:39.3302327Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq9a_uy55/_remote_module_non_scriptable.py 2022-08-17T12:47:39.3845564Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl4jd50yd 2022-08-17T12:47:39.3846561Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl4jd50yd/_remote_module_non_scriptable.py 2022-08-17T12:47:39.3873083Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1rlqopb4 2022-08-17T12:47:39.3875957Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1rlqopb4/_remote_module_non_scriptable.py 2022-08-17T12:47:39.7483093Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:47:39.7562295Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:47:39.8038457Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:47:39.8195123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:47:39.8504351Z fi_getinfo: -61 2022-08-17T12:47:39.8580944Z fi_getinfo: -61 2022-08-17T12:47:39.9056724Z fi_getinfo: -61 2022-08-17T12:47:39.9213786Z fi_getinfo: -61 2022-08-17T12:47:45.6898915Z ok (9.449s) 2022-08-17T12:47:45.6899134Z 2022-08-17T12:47:45.6899535Z ---------------------------------------------------------------------- 2022-08-17T12:47:45.6899901Z Ran 1 test in 9.450s 2022-08-17T12:47:45.6900073Z 2022-08-17T12:47:45.6900151Z OK 2022-08-17T12:47:45.6900293Z 2022-08-17T12:47:45.6900430Z Generating XML reports... 2022-08-17T12:47:45.6938953Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124736.xml 2022-08-17T12:47:47.4506274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:47.4506788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:47.4507893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:47.4508374Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:47.6255688Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3o06iw6o 2022-08-17T12:47:47.6258097Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3o06iw6o/_remote_module_non_scriptable.py 2022-08-17T12:47:48.0484771Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:47:48.0501185Z 2022-08-17T12:47:48.0501329Z Running tests... 
2022-08-17T12:47:48.0502214Z ---------------------------------------------------------------------- 2022-08-17T12:47:49.5629581Z test_device_map_gpu_mixed_7 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:47:49.5807553Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12355 2022-08-17T12:47:49.5814526Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12356 2022-08-17T12:47:49.5821112Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 12357 2022-08-17T12:47:49.5828329Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 12358 2022-08-17T12:47:50.9840200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:50.9840851Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:50.9841726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:50.9842236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:50.9897084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:50.9897789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:50.9900282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:50.9900922Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:50.9903569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:50.9904328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:50.9907687Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:50.9908396Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:51.1026558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:51.1027073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:51.1028438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:51.1028929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:51.1572476Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaa9nbl2v 2022-08-17T12:47:51.1575359Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaa9nbl2v/_remote_module_non_scriptable.py 2022-08-17T12:47:51.1616012Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprhk1npg_ 2022-08-17T12:47:51.1618530Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprhk1npg_/_remote_module_non_scriptable.py 2022-08-17T12:47:51.1646891Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0yhhh90o 2022-08-17T12:47:51.1649617Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0yhhh90o/_remote_module_non_scriptable.py 2022-08-17T12:47:51.2715883Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdv3_yyjb 2022-08-17T12:47:51.2716910Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdv3_yyjb/_remote_module_non_scriptable.py 2022-08-17T12:47:51.5867805Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:47:51.5922310Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:47:51.5948970Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:47:51.6943342Z fi_getinfo: -61 2022-08-17T12:47:51.6947503Z fi_getinfo: -61 2022-08-17T12:47:51.6994700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:47:51.7012620Z fi_getinfo: -61 2022-08-17T12:47:51.8013005Z fi_getinfo: -61 2022-08-17T12:47:57.5036153Z ok (9.453s) 2022-08-17T12:47:57.5036354Z 2022-08-17T12:47:57.5036768Z ---------------------------------------------------------------------- 2022-08-17T12:47:57.5037110Z Ran 1 test in 9.453s 2022-08-17T12:47:57.5037263Z 2022-08-17T12:47:57.5037353Z OK 2022-08-17T12:47:57.5037489Z 2022-08-17T12:47:57.5037626Z Generating XML reports... 2022-08-17T12:47:57.5073729Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124748.xml 2022-08-17T12:47:59.2693928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:47:59.2694459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:47:59.2695341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:47:59.2695841Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:47:59.4429605Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdauavy1j 2022-08-17T12:47:59.4432060Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdauavy1j/_remote_module_non_scriptable.py 2022-08-17T12:47:59.8708201Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:47:59.8723981Z 2022-08-17T12:47:59.8724277Z Running tests... 
2022-08-17T12:47:59.8724767Z ---------------------------------------------------------------------- 2022-08-17T12:48:01.3930750Z test_device_map_gpu_mixed_8 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:48:01.4109065Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12714 2022-08-17T12:48:01.4116253Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12715 2022-08-17T12:48:01.4122788Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 12716 2022-08-17T12:48:01.4129356Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 12717 2022-08-17T12:48:02.8145704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:02.8146237Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:02.8147571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:02.8148078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:02.8181759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:02.8182257Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:02.8184910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:02.8185404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:02.8292655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:02.8293132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:02.8295993Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:02.8296660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:02.8468695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:02.8469175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:02.8471944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:02.8472439Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:02.9866686Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg5i5r7pj 2022-08-17T12:48:02.9868092Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr4o9aol8 2022-08-17T12:48:02.9868655Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg5i5r7pj/_remote_module_non_scriptable.py 2022-08-17T12:48:02.9871171Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr4o9aol8/_remote_module_non_scriptable.py 2022-08-17T12:48:02.9990461Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqx0e_48y 2022-08-17T12:48:02.9993426Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqx0e_48y/_remote_module_non_scriptable.py 2022-08-17T12:48:03.0197669Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvw447io6 2022-08-17T12:48:03.0200447Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvw447io6/_remote_module_non_scriptable.py 2022-08-17T12:48:03.4164111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:48:03.4256138Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:48:03.4276515Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:48:03.4489534Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:48:03.5185993Z fi_getinfo: -61 2022-08-17T12:48:03.5276977Z fi_getinfo: -61 2022-08-17T12:48:03.5296074Z fi_getinfo: -61 2022-08-17T12:48:03.5510925Z fi_getinfo: -61 2022-08-17T12:48:09.3332182Z ok (9.460s) 2022-08-17T12:48:09.3332455Z 2022-08-17T12:48:09.3332841Z ---------------------------------------------------------------------- 2022-08-17T12:48:09.3333182Z Ran 1 test in 9.461s 2022-08-17T12:48:09.3333349Z 2022-08-17T12:48:09.3333447Z OK 2022-08-17T12:48:09.3333567Z 2022-08-17T12:48:09.3333700Z Generating XML reports... 2022-08-17T12:48:09.3370804Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124759.xml 2022-08-17T12:48:11.0622729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:11.0623234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:11.0624577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:11.0625094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:11.2302443Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn5z2grbq 2022-08-17T12:48:11.2305271Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn5z2grbq/_remote_module_non_scriptable.py 2022-08-17T12:48:11.6426273Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:48:11.6442059Z 2022-08-17T12:48:11.6442393Z Running tests... 
2022-08-17T12:48:11.6442873Z ---------------------------------------------------------------------- 2022-08-17T12:48:13.1254670Z test_device_map_gpu_mixed_self_1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:48:13.1430520Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13073 2022-08-17T12:48:13.1436722Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13074 2022-08-17T12:48:13.1442840Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 13075 2022-08-17T12:48:13.1449134Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 13076 2022-08-17T12:48:14.5303928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:14.5304739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:14.5305603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:14.5306085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:14.5330002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:14.5330480Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:14.5333710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:14.5334198Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:14.5560701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:14.5561172Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:14.5564152Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:14.5564627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:14.5597083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:14.5597557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:14.5600216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:14.5600708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:14.7057167Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqgh_gwin 2022-08-17T12:48:14.7058115Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp55m2vmv4 2022-08-17T12:48:14.7058691Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqgh_gwin/_remote_module_non_scriptable.py 2022-08-17T12:48:14.7061356Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp55m2vmv4/_remote_module_non_scriptable.py 2022-08-17T12:48:14.7270641Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuajg3ipx 2022-08-17T12:48:14.7273327Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuajg3ipx/_remote_module_non_scriptable.py 2022-08-17T12:48:14.7311497Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6xavupys 2022-08-17T12:48:14.7314491Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6xavupys/_remote_module_non_scriptable.py 2022-08-17T12:48:15.1394359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:48:15.1399931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:48:15.1454843Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:48:15.1681580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:48:15.2414377Z fi_getinfo: -61 2022-08-17T12:48:15.2418301Z fi_getinfo: -61 2022-08-17T12:48:15.2473509Z fi_getinfo: -61 2022-08-17T12:48:15.2700282Z fi_getinfo: -61 2022-08-17T12:48:21.1654315Z ok (9.521s) 2022-08-17T12:48:21.1654552Z 2022-08-17T12:48:21.1654945Z ---------------------------------------------------------------------- 2022-08-17T12:48:21.1655628Z Ran 1 test in 9.521s 2022-08-17T12:48:21.1655798Z 2022-08-17T12:48:21.1655873Z OK 2022-08-17T12:48:21.1656009Z 2022-08-17T12:48:21.1656144Z Generating XML reports... 2022-08-17T12:48:21.1691817Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124811.xml 2022-08-17T12:48:22.9012838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:22.9013350Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:22.9014263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:22.9014761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:23.0784891Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg3nwg1jt 2022-08-17T12:48:23.0787657Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg3nwg1jt/_remote_module_non_scriptable.py 2022-08-17T12:48:23.5120486Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:48:23.5138737Z 2022-08-17T12:48:23.5139130Z Running tests... 
2022-08-17T12:48:23.5140018Z ---------------------------------------------------------------------- 2022-08-17T12:48:25.0149572Z test_device_map_gpu_mixed_self_2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:48:25.0329570Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13424 2022-08-17T12:48:25.0336325Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13425 2022-08-17T12:48:25.0342509Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 13426 2022-08-17T12:48:25.0349350Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 13427 2022-08-17T12:48:26.4407217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:26.4408223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:26.4409436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:26.4410342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:26.4622306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:26.4623215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:26.4625222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:26.4626094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:26.4683456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:26.4684374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:26.4687068Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:26.4688044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:26.4696555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:26.4697493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:26.4699202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:26.4700184Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:26.6088461Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv9thbwv7 2022-08-17T12:48:26.6089934Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv9thbwv7/_remote_module_non_scriptable.py 2022-08-17T12:48:26.6292988Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5pnajh1m 2022-08-17T12:48:26.6294768Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5pnajh1m/_remote_module_non_scriptable.py 2022-08-17T12:48:26.6438999Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuxn71vlv 2022-08-17T12:48:26.6440921Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuxn71vlv/_remote_module_non_scriptable.py 2022-08-17T12:48:26.6507190Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpanojj1be 2022-08-17T12:48:26.6509696Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpanojj1be/_remote_module_non_scriptable.py 2022-08-17T12:48:27.0329872Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:48:27.0563276Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:48:27.0742836Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:48:27.0846322Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:48:27.1348979Z fi_getinfo: -61 2022-08-17T12:48:27.1582168Z fi_getinfo: -61 2022-08-17T12:48:27.1761401Z fi_getinfo: -61 2022-08-17T12:48:27.1864901Z fi_getinfo: -61 2022-08-17T12:48:32.9557567Z ok (9.442s) 2022-08-17T12:48:32.9557837Z 2022-08-17T12:48:32.9558224Z ---------------------------------------------------------------------- 2022-08-17T12:48:32.9558571Z Ran 1 test in 9.442s 2022-08-17T12:48:32.9558740Z 2022-08-17T12:48:32.9558836Z OK 2022-08-17T12:48:32.9558955Z 2022-08-17T12:48:32.9559098Z Generating XML reports... 2022-08-17T12:48:32.9596014Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124823.xml 2022-08-17T12:48:34.7252155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:34.7252674Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:34.7254077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:34.7254574Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:34.8992067Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa7szppse 2022-08-17T12:48:34.8994052Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa7szppse/_remote_module_non_scriptable.py 2022-08-17T12:48:35.3228437Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:48:35.3243938Z 2022-08-17T12:48:35.3244078Z Running tests... 
2022-08-17T12:48:35.3244810Z ---------------------------------------------------------------------- 2022-08-17T12:48:36.8439504Z test_device_map_gpu_mixed_self_3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:48:36.8623071Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13775 2022-08-17T12:48:36.8629792Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13776 2022-08-17T12:48:36.8636004Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 13777 2022-08-17T12:48:36.8642394Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 13778 2022-08-17T12:48:38.2432642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:38.2433167Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:38.2433940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:38.2434801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:38.2467296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:38.2467771Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:38.2470585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:38.2471082Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:38.3171802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:38.3172310Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:38.3174324Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:38.3174845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:38.3208000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:38.3208480Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:38.3211246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:38.3211718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:38.4101481Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk_85vizw 2022-08-17T12:48:38.4103552Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk_85vizw/_remote_module_non_scriptable.py 2022-08-17T12:48:38.4141580Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpebez9lh4 2022-08-17T12:48:38.4144646Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpebez9lh4/_remote_module_non_scriptable.py 2022-08-17T12:48:38.4866152Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqsm48azv 2022-08-17T12:48:38.4867545Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqsm48azv/_remote_module_non_scriptable.py 2022-08-17T12:48:38.4954217Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp7auzt_5 2022-08-17T12:48:38.4956532Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp7auzt_5/_remote_module_non_scriptable.py 2022-08-17T12:48:38.8291368Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:48:38.8348393Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:48:38.9032464Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:48:38.9270522Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:48:38.9313476Z fi_getinfo: -61 2022-08-17T12:48:38.9369966Z fi_getinfo: -61 2022-08-17T12:48:39.0051541Z fi_getinfo: -61 2022-08-17T12:48:39.0288372Z fi_getinfo: -61 2022-08-17T12:48:44.8852514Z ok (9.560s) 2022-08-17T12:48:44.8852722Z 2022-08-17T12:48:44.8853135Z ---------------------------------------------------------------------- 2022-08-17T12:48:44.8853479Z Ran 1 test in 9.561s 2022-08-17T12:48:44.8853646Z 2022-08-17T12:48:44.8853738Z OK 2022-08-17T12:48:44.8853875Z 2022-08-17T12:48:44.8853993Z Generating XML reports... 2022-08-17T12:48:44.8891379Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124835.xml 2022-08-17T12:48:46.6588996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:46.6589527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:46.6590695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:46.6591178Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:46.8374728Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp9t8xumd 2022-08-17T12:48:46.8377037Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp9t8xumd/_remote_module_non_scriptable.py 2022-08-17T12:48:47.2648377Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:48:47.2664858Z 2022-08-17T12:48:47.2665254Z Running tests... 
2022-08-17T12:48:47.2665722Z ---------------------------------------------------------------------- 2022-08-17T12:48:48.7702210Z test_device_map_gpu_mixed_self_4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:48:48.7889892Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14126 2022-08-17T12:48:48.7896228Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14127 2022-08-17T12:48:48.7902919Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 14128 2022-08-17T12:48:48.7910087Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 14129 2022-08-17T12:48:50.1785957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:50.1786483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:50.1788061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:50.1788870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:50.1808403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:50.1808896Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:50.1811722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:50.1812231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:50.2282412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:50.2282890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:50.2285982Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:50.2286480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:50.2656575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:50.2657049Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:50.2660026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:50.2660812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:50.3514130Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdfxokboo 2022-08-17T12:48:50.3515999Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdfxokboo/_remote_module_non_scriptable.py 2022-08-17T12:48:50.3534731Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6bbt5k_e 2022-08-17T12:48:50.3537737Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6bbt5k_e/_remote_module_non_scriptable.py 2022-08-17T12:48:50.3955155Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppe25o0cz 2022-08-17T12:48:50.3957181Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppe25o0cz/_remote_module_non_scriptable.py 2022-08-17T12:48:50.4413389Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6f9wh56m 2022-08-17T12:48:50.4415243Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6f9wh56m/_remote_module_non_scriptable.py 2022-08-17T12:48:50.7725412Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:48:50.7738756Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:48:50.8073536Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:48:50.8656872Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:48:50.8747438Z fi_getinfo: -61 2022-08-17T12:48:50.8759064Z fi_getinfo: -61 2022-08-17T12:48:50.9094014Z fi_getinfo: -61 2022-08-17T12:48:50.9678570Z fi_getinfo: -61 2022-08-17T12:48:56.8118638Z ok (9.545s) 2022-08-17T12:48:56.8118886Z 2022-08-17T12:48:56.8119294Z ---------------------------------------------------------------------- 2022-08-17T12:48:56.8119643Z Ran 1 test in 9.545s 2022-08-17T12:48:56.8119811Z 2022-08-17T12:48:56.8119905Z OK 2022-08-17T12:48:56.8120044Z 2022-08-17T12:48:56.8120172Z Generating XML reports... 2022-08-17T12:48:56.8156254Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124847.xml 2022-08-17T12:48:58.5882238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:48:58.5882748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:48:58.5883962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:48:58.5884465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:48:58.7654604Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppwz5g6ki 2022-08-17T12:48:58.7657658Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppwz5g6ki/_remote_module_non_scriptable.py 2022-08-17T12:48:59.1904581Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:48:59.1921753Z 2022-08-17T12:48:59.1921982Z Running tests... 
2022-08-17T12:48:59.1922431Z ---------------------------------------------------------------------- 2022-08-17T12:49:00.7094008Z test_device_map_gpu_mixed_self_5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:49:00.7280831Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14477 2022-08-17T12:49:00.7287380Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14478 2022-08-17T12:49:00.7293403Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 14479 2022-08-17T12:49:00.7299959Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 14480 2022-08-17T12:49:02.1270395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:02.1271245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:02.1271877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:02.1272379Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:02.1751011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:02.1751511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:02.1753185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:02.1753679Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:02.1962877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:02.1963336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:02.1965759Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:02.1966252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:02.2192238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:02.2192702Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:02.2195683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:02.2980799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:02.2981322Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5c40e2v3 2022-08-17T12:49:02.2981885Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5c40e2v3/_remote_module_non_scriptable.py 2022-08-17T12:49:02.3435504Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcw1ryipj 2022-08-17T12:49:02.3437011Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcw1ryipj/_remote_module_non_scriptable.py 2022-08-17T12:49:02.3647115Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkauom0fv 2022-08-17T12:49:02.3647887Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkauom0fv/_remote_module_non_scriptable.py 2022-08-17T12:49:02.3928211Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv1zvnpbe 2022-08-17T12:49:02.3929279Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv1zvnpbe/_remote_module_non_scriptable.py 2022-08-17T12:49:02.7151865Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:49:02.7669677Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:49:02.7815332Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:49:02.8171917Z fi_getinfo: -61 2022-08-17T12:49:02.8185811Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:49:02.8688535Z fi_getinfo: -61 2022-08-17T12:49:02.8838252Z fi_getinfo: -61 2022-08-17T12:49:02.9205778Z fi_getinfo: -61 2022-08-17T12:49:08.7512356Z ok (9.559s) 2022-08-17T12:49:08.7512559Z 2022-08-17T12:49:08.7512953Z ---------------------------------------------------------------------- 2022-08-17T12:49:08.7513303Z Ran 1 test in 9.559s 2022-08-17T12:49:08.7513471Z 2022-08-17T12:49:08.7513567Z OK 2022-08-17T12:49:08.7514538Z 2022-08-17T12:49:08.7514672Z Generating XML reports... 2022-08-17T12:49:08.7552228Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124859.xml 2022-08-17T12:49:10.5335974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:10.5336515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:10.5338532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:10.5339017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:10.7073453Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz31u7v4z 2022-08-17T12:49:10.7076404Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz31u7v4z/_remote_module_non_scriptable.py 2022-08-17T12:49:11.1327168Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:49:11.1343510Z 2022-08-17T12:49:11.1343674Z Running tests... 
2022-08-17T12:49:11.1344114Z ---------------------------------------------------------------------- 2022-08-17T12:49:12.6450367Z test_device_map_gpu_mixed_self_6 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:49:12.6637004Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14828 2022-08-17T12:49:12.6642298Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14829 2022-08-17T12:49:12.6648637Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 14830 2022-08-17T12:49:12.6655848Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 14831 2022-08-17T12:49:14.0595525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:14.0596245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:14.0596866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:14.0597373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:14.0600049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:14.0600535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:14.0603244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:14.0603745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:14.1291780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:14.1292276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:14.1293888Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:14.1294391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:14.1578676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:14.1579151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:14.1581826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:14.1582320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:14.2325176Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpebeob1xe 2022-08-17T12:49:14.2326195Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd3azgoma 2022-08-17T12:49:14.2326767Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpebeob1xe/_remote_module_non_scriptable.py 2022-08-17T12:49:14.2328861Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd3azgoma/_remote_module_non_scriptable.py 2022-08-17T12:49:14.3032614Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2wcwah7m 2022-08-17T12:49:14.3033891Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2wcwah7m/_remote_module_non_scriptable.py 2022-08-17T12:49:14.3315517Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp07rhhrdx 2022-08-17T12:49:14.3316850Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp07rhhrdx/_remote_module_non_scriptable.py 2022-08-17T12:49:14.6525598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:49:14.6553445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:49:14.7330574Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:49:14.7547233Z fi_getinfo: -61 2022-08-17T12:49:14.7559663Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:49:14.7573370Z fi_getinfo: -61 2022-08-17T12:49:14.8349763Z fi_getinfo: -61 2022-08-17T12:49:14.8579160Z fi_getinfo: -61 2022-08-17T12:49:20.5878544Z ok (9.453s) 2022-08-17T12:49:20.5878756Z 2022-08-17T12:49:20.5879156Z ---------------------------------------------------------------------- 2022-08-17T12:49:20.5879497Z Ran 1 test in 9.453s 2022-08-17T12:49:20.5879667Z 2022-08-17T12:49:20.5879760Z OK 2022-08-17T12:49:20.5879896Z 2022-08-17T12:49:20.5880033Z Generating XML reports... 2022-08-17T12:49:20.5915293Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124911.xml 2022-08-17T12:49:22.3507264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:22.3507878Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:22.3508977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:22.3509470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:22.5248353Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpajlag9ng 2022-08-17T12:49:22.5250943Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpajlag9ng/_remote_module_non_scriptable.py 2022-08-17T12:49:22.9534915Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:49:22.9551486Z 2022-08-17T12:49:22.9551733Z Running tests... 
2022-08-17T12:49:22.9552155Z ---------------------------------------------------------------------- 2022-08-17T12:49:24.4613125Z test_device_map_gpu_mixed_self_7 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:49:24.4790427Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15179 2022-08-17T12:49:24.4796953Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15180 2022-08-17T12:49:24.4802847Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 15181 2022-08-17T12:49:24.4809034Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 15182 2022-08-17T12:49:25.8606398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:25.8607014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:25.8608042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:25.8608523Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:25.8819967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:25.8820452Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:25.8822759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:25.8823878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:25.9000182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:25.9000645Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:25.9003357Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:25.9003845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:25.9027446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:25.9028130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:25.9030880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:25.9031370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:26.0273271Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzsny9twy 2022-08-17T12:49:26.0274526Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzsny9twy/_remote_module_non_scriptable.py 2022-08-17T12:49:26.0516774Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0pce1_uv 2022-08-17T12:49:26.0519295Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0pce1_uv/_remote_module_non_scriptable.py 2022-08-17T12:49:26.0697228Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpurs96gge 2022-08-17T12:49:26.0699974Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpurs96gge/_remote_module_non_scriptable.py 2022-08-17T12:49:26.0757591Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjt0mij3k 2022-08-17T12:49:26.0760421Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjt0mij3k/_remote_module_non_scriptable.py 2022-08-17T12:49:26.4427397Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:49:26.4804876Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:49:26.4891003Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:49:26.5094147Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:49:26.5448717Z fi_getinfo: -61 2022-08-17T12:49:26.5828450Z fi_getinfo: -61 2022-08-17T12:49:26.5913313Z fi_getinfo: -61 2022-08-17T12:49:26.6114407Z fi_getinfo: -61 2022-08-17T12:49:32.4014526Z ok (9.446s) 2022-08-17T12:49:32.4014748Z 2022-08-17T12:49:32.4015152Z ---------------------------------------------------------------------- 2022-08-17T12:49:32.4015497Z Ran 1 test in 9.446s 2022-08-17T12:49:32.4015645Z 2022-08-17T12:49:32.4015739Z OK 2022-08-17T12:49:32.4015873Z 2022-08-17T12:49:32.4016020Z Generating XML reports... 2022-08-17T12:49:32.4052309Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124922.xml 2022-08-17T12:49:34.1764693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:34.1765249Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:34.1766394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:34.1766913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:34.3500364Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzwkiuxnb 2022-08-17T12:49:34.3502896Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzwkiuxnb/_remote_module_non_scriptable.py 2022-08-17T12:49:34.7779241Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:49:34.7795292Z 2022-08-17T12:49:34.7795496Z Running tests... 
2022-08-17T12:49:34.7795948Z ---------------------------------------------------------------------- 2022-08-17T12:49:36.2884693Z test_device_map_gpu_mixed_self_8 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:49:36.3069497Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15530 2022-08-17T12:49:36.3075601Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15531 2022-08-17T12:49:36.3081787Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 15532 2022-08-17T12:49:36.3088082Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 15533 2022-08-17T12:49:37.6995042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:37.6995562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:37.6996167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:37.6996630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:37.7353795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:37.7354264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:37.7356845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:37.7357311Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:37.7475524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:37.7475995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:37.7478826Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:37.7479287Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:37.7829722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:37.7830197Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:37.7832586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:37.7833047Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:37.8663380Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2fjxyj9q 2022-08-17T12:49:37.8665077Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2fjxyj9q/_remote_module_non_scriptable.py 2022-08-17T12:49:37.9041347Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0a0m31vh 2022-08-17T12:49:37.9043619Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0a0m31vh/_remote_module_non_scriptable.py 2022-08-17T12:49:37.9216001Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfv4n3ne0 2022-08-17T12:49:37.9217948Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfv4n3ne0/_remote_module_non_scriptable.py 2022-08-17T12:49:37.9490554Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbxo9mupl 2022-08-17T12:49:37.9491802Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbxo9mupl/_remote_module_non_scriptable.py 2022-08-17T12:49:38.2800268Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:49:38.3260286Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:49:38.3542979Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:49:38.3657323Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:49:38.3820651Z fi_getinfo: -61 2022-08-17T12:49:38.4279179Z fi_getinfo: -61 2022-08-17T12:49:38.4565755Z fi_getinfo: -61 2022-08-17T12:49:38.4677274Z fi_getinfo: -61 2022-08-17T12:49:44.2292877Z ok (9.449s) 2022-08-17T12:49:44.2293100Z 2022-08-17T12:49:44.2293496Z ---------------------------------------------------------------------- 2022-08-17T12:49:44.2293817Z Ran 1 test in 9.450s 2022-08-17T12:49:44.2293988Z 2022-08-17T12:49:44.2294084Z OK 2022-08-17T12:49:44.2294221Z 2022-08-17T12:49:44.2295312Z Generating XML reports... 2022-08-17T12:49:44.2331592Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124934.xml 2022-08-17T12:49:46.0061172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:46.0061684Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:46.0062596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:46.0063067Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:46.1815484Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnaqnnsod 2022-08-17T12:49:46.1817536Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnaqnnsod/_remote_module_non_scriptable.py 2022-08-17T12:49:46.6050613Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:49:46.6067453Z 2022-08-17T12:49:46.6067939Z Running tests... 
2022-08-17T12:49:46.6068542Z ---------------------------------------------------------------------- 2022-08-17T12:49:48.1340765Z test_device_map_gpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:49:48.1528798Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15881 2022-08-17T12:49:48.1534888Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15882 2022-08-17T12:49:48.1541100Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 15883 2022-08-17T12:49:48.1548155Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 15884 2022-08-17T12:49:49.5305729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:49.5306220Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:49.5307317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:49.5307822Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:49.5721416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:49.5721907Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:49.5724337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:49.5724828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:49.5746265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:49.5746733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:49.5749735Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:49.5750229Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:49.6193839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:49.6194337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:49.6196331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:49.6196815Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:49.6978638Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6m03s502 2022-08-17T12:49:49.6980833Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6m03s502/_remote_module_non_scriptable.py 2022-08-17T12:49:49.7432290Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4n1hl3qo 2022-08-17T12:49:49.7434613Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4n1hl3qo/_remote_module_non_scriptable.py 2022-08-17T12:49:49.7465719Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqe80j8js 2022-08-17T12:49:49.7468552Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqe80j8js/_remote_module_non_scriptable.py 2022-08-17T12:49:49.7917344Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy0311scn 2022-08-17T12:49:49.7918471Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy0311scn/_remote_module_non_scriptable.py 2022-08-17T12:49:50.1101104Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:49:50.1651066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:49:50.1670689Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:49:50.2067384Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:49:50.2231575Z fi_getinfo: -61 2022-08-17T12:49:50.2673618Z fi_getinfo: -61 2022-08-17T12:49:50.2689091Z fi_getinfo: -61 2022-08-17T12:49:50.3086060Z fi_getinfo: -61 2022-08-17T12:49:53.8698507Z ok (7.263s) 2022-08-17T12:49:53.8698726Z 2022-08-17T12:49:53.8699119Z ---------------------------------------------------------------------- 2022-08-17T12:49:53.8699471Z Ran 1 test in 7.263s 2022-08-17T12:49:53.8699636Z 2022-08-17T12:49:53.8699720Z OK 2022-08-17T12:49:53.8699858Z 2022-08-17T12:49:53.8699993Z Generating XML reports... 2022-08-17T12:49:53.8737477Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124946.xml 2022-08-17T12:49:55.5991732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:55.5992283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:55.5993411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:55.5993895Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:55.7724059Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqdxoo4pw 2022-08-17T12:49:55.7725964Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqdxoo4pw/_remote_module_non_scriptable.py 2022-08-17T12:49:56.1941592Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:49:56.1958361Z 2022-08-17T12:49:56.1958656Z Running tests... 
2022-08-17T12:49:56.1959100Z ---------------------------------------------------------------------- 2022-08-17T12:49:57.6978888Z test_device_map_gpu_non_default_to_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:49:57.7156233Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16228 2022-08-17T12:49:57.7162696Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16229 2022-08-17T12:49:57.7169316Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16230 2022-08-17T12:49:57.7175808Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16231 2022-08-17T12:49:59.1126204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:59.1127111Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:59.1127724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:59.1128183Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:59.1247971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:59.1249189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:59.1251028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:59.1251520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:59.1484167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:59.1484648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:59.1487204Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:59.1487683Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:59.1800575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:49:59.1801069Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:49:59.1803521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:49:59.1804007Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:49:59.2809999Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdkd3fobb 2022-08-17T12:49:59.2811364Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdkd3fobb/_remote_module_non_scriptable.py 2022-08-17T12:49:59.2931674Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1lmffy77 2022-08-17T12:49:59.2934212Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1lmffy77/_remote_module_non_scriptable.py 2022-08-17T12:49:59.3168300Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_acfvxf2 2022-08-17T12:49:59.3170544Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_acfvxf2/_remote_module_non_scriptable.py 2022-08-17T12:49:59.3532612Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf9qenit5 2022-08-17T12:49:59.3533479Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf9qenit5/_remote_module_non_scriptable.py 2022-08-17T12:49:59.7027677Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:49:59.7202723Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:49:59.7323890Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:49:59.7761501Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:49:59.8048370Z fi_getinfo: -61 2022-08-17T12:49:59.8221713Z fi_getinfo: -61 2022-08-17T12:49:59.8343945Z fi_getinfo: -61 2022-08-17T12:49:59.8781095Z fi_getinfo: -61 2022-08-17T12:50:05.7385968Z ok (9.542s) 2022-08-17T12:50:05.7386193Z 2022-08-17T12:50:05.7386595Z ---------------------------------------------------------------------- 2022-08-17T12:50:05.7386946Z Ran 1 test in 9.543s 2022-08-17T12:50:05.7387116Z 2022-08-17T12:50:05.7389965Z OK 2022-08-17T12:50:05.7390419Z 2022-08-17T12:50:05.7390763Z Generating XML reports... 2022-08-17T12:50:05.7423838Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124956.xml 2022-08-17T12:50:07.4727276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:07.4727803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:07.4729062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:07.4729525Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:07.6485117Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6my4hugk 2022-08-17T12:50:07.6487100Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6my4hugk/_remote_module_non_scriptable.py 2022-08-17T12:50:08.0736184Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:50:08.0752935Z 2022-08-17T12:50:08.0753128Z Running tests... 
2022-08-17T12:50:08.0753569Z ---------------------------------------------------------------------- 2022-08-17T12:50:09.5744771Z test_device_map_gpu_to_cpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:50:09.5922438Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16587 2022-08-17T12:50:09.5929045Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16588 2022-08-17T12:50:09.5935224Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16589 2022-08-17T12:50:09.5941938Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16590 2022-08-17T12:50:11.0077286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:11.0077809Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:11.0079092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:11.0079564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:11.0227885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:11.0228352Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:11.0231449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:11.0231934Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:11.0427449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:11.0427932Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:11.0430997Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:11.0431463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:11.1002038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:11.1002524Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:11.1004224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:11.1004685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:11.1756271Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2hhflq9e 2022-08-17T12:50:11.1758383Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2hhflq9e/_remote_module_non_scriptable.py 2022-08-17T12:50:11.1925620Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6cvvyaxf 2022-08-17T12:50:11.1928092Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6cvvyaxf/_remote_module_non_scriptable.py 2022-08-17T12:50:11.2090195Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3zv6kct7 2022-08-17T12:50:11.2093003Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3zv6kct7/_remote_module_non_scriptable.py 2022-08-17T12:50:11.2761821Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp89xvftv0 2022-08-17T12:50:11.2762805Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp89xvftv0/_remote_module_non_scriptable.py 2022-08-17T12:50:11.6009846Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:50:11.6172275Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:50:11.6259829Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:50:11.6991174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:50:11.7031127Z fi_getinfo: -61 2022-08-17T12:50:11.7190835Z fi_getinfo: -61 2022-08-17T12:50:11.7278769Z fi_getinfo: -61 2022-08-17T12:50:11.8011581Z fi_getinfo: -61 2022-08-17T12:50:15.4097580Z ok (7.334s) 2022-08-17T12:50:15.4097807Z 2022-08-17T12:50:15.4098204Z ---------------------------------------------------------------------- 2022-08-17T12:50:15.4098570Z Ran 1 test in 7.334s 2022-08-17T12:50:15.4098737Z 2022-08-17T12:50:15.4098837Z OK 2022-08-17T12:50:15.4098977Z 2022-08-17T12:50:15.4099123Z Generating XML reports... 2022-08-17T12:50:15.4137592Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125008.xml 2022-08-17T12:50:17.1965252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:17.1965751Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:17.1966845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:17.1967328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:17.3729680Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxzqb3g6t 2022-08-17T12:50:17.3731923Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxzqb3g6t/_remote_module_non_scriptable.py 2022-08-17T12:50:17.7938714Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:50:17.7954931Z 2022-08-17T12:50:17.7955172Z Running tests... 
2022-08-17T12:50:17.7955609Z ---------------------------------------------------------------------- 2022-08-17T12:50:19.3110429Z test_device_map_gpu_to_cpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:50:19.3296813Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16938 2022-08-17T12:50:19.3302632Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16939 2022-08-17T12:50:19.3309316Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16940 2022-08-17T12:50:19.3315695Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16941 2022-08-17T12:50:20.7106121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:20.7106637Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:20.7107434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:20.7108191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:20.7183219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:20.7183686Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:20.7184440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:20.7184878Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:20.7187190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:20.7187680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:20.7188437Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:20.7188881Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:20.7913618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:20.7914101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:20.7916128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:20.7916585Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:20.8782266Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvoh8lj4q 2022-08-17T12:50:20.8783750Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvoh8lj4q/_remote_module_non_scriptable.py 2022-08-17T12:50:20.8921704Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd8s803af 2022-08-17T12:50:20.8924514Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd8s803af/_remote_module_non_scriptable.py 2022-08-17T12:50:20.8932739Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdfm30_2u 2022-08-17T12:50:20.8935875Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdfm30_2u/_remote_module_non_scriptable.py 2022-08-17T12:50:20.9650284Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2laj4_vu 2022-08-17T12:50:20.9651885Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2laj4_vu/_remote_module_non_scriptable.py 2022-08-17T12:50:21.3069822Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:50:21.3121450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:50:21.3182700Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:50:21.3870089Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:50:21.4123978Z fi_getinfo: -61 2022-08-17T12:50:21.4140160Z fi_getinfo: -61 2022-08-17T12:50:21.4199777Z fi_getinfo: -61 2022-08-17T12:50:21.4889287Z fi_getinfo: -61 2022-08-17T12:50:24.9466018Z ok (7.151s) 2022-08-17T12:50:24.9466223Z 2022-08-17T12:50:24.9466631Z ---------------------------------------------------------------------- 2022-08-17T12:50:24.9466978Z Ran 1 test in 7.151s 2022-08-17T12:50:24.9467153Z 2022-08-17T12:50:24.9467249Z OK 2022-08-17T12:50:24.9469831Z 2022-08-17T12:50:24.9470142Z Generating XML reports... 2022-08-17T12:50:24.9503187Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125017.xml 2022-08-17T12:50:26.6865402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:26.6865927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:26.6867859Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:26.6868622Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:26.8619205Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwlnwc7fm 2022-08-17T12:50:26.8621576Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwlnwc7fm/_remote_module_non_scriptable.py 2022-08-17T12:50:27.2883915Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:50:27.2900489Z 2022-08-17T12:50:27.2900866Z Running tests... 
2022-08-17T12:50:27.2901335Z ---------------------------------------------------------------------- 2022-08-17T12:50:28.8043493Z test_device_maps_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:50:28.8232350Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17289 2022-08-17T12:50:28.8239050Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17290 2022-08-17T12:50:28.8245733Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 17291 2022-08-17T12:50:28.8252327Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 17292 2022-08-17T12:50:30.2391010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:30.2391948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:30.2393135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:30.2394084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:30.2587323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:30.2588247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:30.2590218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:30.2591189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:30.2927656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:30.2928538Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:30.2930864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: 
UserWarning: loaded 238 disabled tests 2022-08-17T12:50:30.2931800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:30.3507899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:30.3508866Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:30.3510363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:30.3511326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:30.4073668Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4v2k_wx3 2022-08-17T12:50:30.4075661Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4v2k_wx3/_remote_module_non_scriptable.py 2022-08-17T12:50:30.4245514Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj2_qweha 2022-08-17T12:50:30.4247833Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj2_qweha/_remote_module_non_scriptable.py 2022-08-17T12:50:30.4619007Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpktget1mg 2022-08-17T12:50:30.4620595Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpktget1mg/_remote_module_non_scriptable.py 2022-08-17T12:50:30.5271248Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1gb6v17i 2022-08-17T12:50:30.5272819Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1gb6v17i/_remote_module_non_scriptable.py 2022-08-17T12:50:30.8321036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:50:30.8429862Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:50:30.8810187Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-08-17T12:50:30.9431620Z fi_getinfo: -61 2022-08-17T12:50:30.9447646Z fi_getinfo: -61 2022-08-17T12:50:30.9511428Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:50:30.9828962Z fi_getinfo: -61 2022-08-17T12:50:31.0530449Z fi_getinfo: -61 2022-08-17T12:50:36.7473084Z ok (9.457s) 2022-08-17T12:50:36.7473344Z 2022-08-17T12:50:36.7473742Z ---------------------------------------------------------------------- 2022-08-17T12:50:36.7474087Z Ran 1 test in 9.457s 2022-08-17T12:50:36.7474255Z 2022-08-17T12:50:36.7474367Z OK 2022-08-17T12:50:36.7474506Z 2022-08-17T12:50:36.7477585Z Generating XML reports... 2022-08-17T12:50:36.7512398Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125027.xml 2022-08-17T12:50:38.5294296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:38.5294807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:38.5295617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:38.5296096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:38.7037207Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1hdmtz0h 2022-08-17T12:50:38.7039053Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1hdmtz0h/_remote_module_non_scriptable.py 2022-08-17T12:50:39.1314575Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:50:39.1330755Z 2022-08-17T12:50:39.1331175Z Running tests... 
2022-08-17T12:50:39.1331617Z ---------------------------------------------------------------------- 2022-08-17T12:50:40.6333166Z test_device_maps_in_options (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:50:40.6519030Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17648 2022-08-17T12:50:40.6524888Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17649 2022-08-17T12:50:40.6531076Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 17650 2022-08-17T12:50:40.6537535Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 17651 2022-08-17T12:50:42.1220116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:42.1220625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:42.1221657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:42.1222156Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:42.1253706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:42.1254175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:42.1257383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:42.1257858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:42.1273005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:42.1273740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:42.1276341Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:42.1276806Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:42.1322419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:42.1322880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:42.1325835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:42.1326318Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:42.2921779Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpis3hqiqw 2022-08-17T12:50:42.2922814Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpis3hqiqw/_remote_module_non_scriptable.py 2022-08-17T12:50:42.2988018Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa6nqiglr 2022-08-17T12:50:42.2990633Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa6nqiglr/_remote_module_non_scriptable.py 2022-08-17T12:50:42.3011034Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6hqbgmhb 2022-08-17T12:50:42.3014120Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6hqbgmhb/_remote_module_non_scriptable.py 2022-08-17T12:50:42.3087959Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6klajicv 2022-08-17T12:50:42.3090401Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6klajicv/_remote_module_non_scriptable.py 2022-08-17T12:50:42.7228561Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:50:42.7245702Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:50:42.7389175Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:50:42.7440234Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:50:42.8247684Z fi_getinfo: -61 2022-08-17T12:50:42.8270118Z fi_getinfo: -61 2022-08-17T12:50:42.8409284Z fi_getinfo: -61 2022-08-17T12:50:42.8459317Z fi_getinfo: -61 2022-08-17T12:50:48.5773299Z ok (9.444s) 2022-08-17T12:50:48.5773522Z 2022-08-17T12:50:48.5774095Z ---------------------------------------------------------------------- 2022-08-17T12:50:48.5774521Z Ran 1 test in 9.444s 2022-08-17T12:50:48.5774689Z 2022-08-17T12:50:48.5774764Z OK 2022-08-17T12:50:48.5774906Z 2022-08-17T12:50:48.5775062Z Generating XML reports... 2022-08-17T12:50:48.5811029Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125039.xml 2022-08-17T12:50:50.3419717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:50.3420234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:50.3421095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:50.3421586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:50.5157364Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9ervy3sq 2022-08-17T12:50:50.5159696Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9ervy3sq/_remote_module_non_scriptable.py 2022-08-17T12:50:50.9443201Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:50:50.9459323Z 2022-08-17T12:50:50.9459556Z Running tests... 
2022-08-17T12:50:50.9459986Z ---------------------------------------------------------------------- 2022-08-17T12:50:52.4488262Z test_device_maps_invalid_max_local_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:50:52.4673202Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18007 2022-08-17T12:50:52.4679384Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18008 2022-08-17T12:50:52.4685510Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18009 2022-08-17T12:50:52.4691844Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18010 2022-08-17T12:50:53.8656816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:53.8657325Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:53.8658478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:53.8658953Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:53.8810670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:53.8811124Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:53.8813522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:53.8814006Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:53.9135064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:53.9135508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:53.9138347Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:53.9138831Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:53.9464187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:53.9464634Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:53.9467500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:53.9467983Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:54.0342123Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpivin4p91 2022-08-17T12:50:54.0344312Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpivin4p91/_remote_module_non_scriptable.py 2022-08-17T12:50:54.0489576Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv2x3f_at 2022-08-17T12:50:54.0492080Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv2x3f_at/_remote_module_non_scriptable.py 2022-08-17T12:50:54.0819309Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0v44xsln 2022-08-17T12:50:54.0821447Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0v44xsln/_remote_module_non_scriptable.py 2022-08-17T12:50:54.1206336Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg0c9w97q 2022-08-17T12:50:54.1207745Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg0c9w97q/_remote_module_non_scriptable.py 2022-08-17T12:50:54.4524687Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:50:54.4657996Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:50:54.4943391Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:50:54.5451810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:50:54.5631061Z fi_getinfo: -61 2022-08-17T12:50:54.5676734Z fi_getinfo: -61 2022-08-17T12:50:54.5960460Z fi_getinfo: -61 2022-08-17T12:50:54.6471503Z fi_getinfo: -61 2022-08-17T12:50:55.0766688Z ok (4.130s) 2022-08-17T12:50:55.0767041Z 2022-08-17T12:50:55.0767501Z ---------------------------------------------------------------------- 2022-08-17T12:50:55.0767853Z Ran 1 test in 4.131s 2022-08-17T12:50:55.0768021Z 2022-08-17T12:50:55.0768118Z OK 2022-08-17T12:50:55.0768255Z 2022-08-17T12:50:55.0768389Z Generating XML reports... 2022-08-17T12:50:55.0806569Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125050.xml 2022-08-17T12:50:56.8530604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:50:56.8531416Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:50:56.8532390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:50:56.8532895Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:50:57.0266584Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_hxfh054 2022-08-17T12:50:57.0268426Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_hxfh054/_remote_module_non_scriptable.py 2022-08-17T12:50:57.4508805Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:50:57.4524810Z 2022-08-17T12:50:57.4525103Z Running tests... 
2022-08-17T12:50:57.4525523Z ---------------------------------------------------------------------- 2022-08-17T12:50:58.9790219Z test_device_maps_invalid_max_remote_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:50:58.9975966Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18194 2022-08-17T12:50:58.9981867Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18195 2022-08-17T12:50:58.9988699Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18196 2022-08-17T12:50:58.9994962Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18197 2022-08-17T12:51:00.3864568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:00.3865078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:00.3866579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:00.3867062Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:00.3888942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:00.3889403Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:00.3892361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:00.3892849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:00.4293968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:00.4294440Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:00.4296713Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:00.4297189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:00.4791552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:00.4792032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:00.4795166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:00.4795679Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:00.5561842Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv9bnrkmi 2022-08-17T12:51:00.5563470Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv9bnrkmi/_remote_module_non_scriptable.py 2022-08-17T12:51:00.5579725Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl7atps4w 2022-08-17T12:51:00.5582480Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl7atps4w/_remote_module_non_scriptable.py 2022-08-17T12:51:00.5977410Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8n3vx6mq 2022-08-17T12:51:00.5979510Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8n3vx6mq/_remote_module_non_scriptable.py 2022-08-17T12:51:00.6540226Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0j0n0po4 2022-08-17T12:51:00.6541291Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0j0n0po4/_remote_module_non_scriptable.py 2022-08-17T12:51:00.9739387Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:51:00.9790792Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:51:01.0131537Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:51:01.0760980Z fi_getinfo: -61 2022-08-17T12:51:01.0794575Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:51:01.0807659Z fi_getinfo: -61 2022-08-17T12:51:01.1150769Z fi_getinfo: -61 2022-08-17T12:51:01.1814433Z fi_getinfo: -61 2022-08-17T12:51:01.6072601Z ok (4.154s) 2022-08-17T12:51:01.6072796Z 2022-08-17T12:51:01.6073213Z ---------------------------------------------------------------------- 2022-08-17T12:51:01.6073578Z Ran 1 test in 4.155s 2022-08-17T12:51:01.6073751Z 2022-08-17T12:51:01.6073851Z OK 2022-08-17T12:51:01.6073986Z 2022-08-17T12:51:01.6074106Z Generating XML reports... 2022-08-17T12:51:01.6112532Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125057.xml 2022-08-17T12:51:03.4212385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:03.4213072Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:03.4213884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:03.4214513Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:03.5954916Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiceq71k0 2022-08-17T12:51:03.5957400Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiceq71k0/_remote_module_non_scriptable.py 2022-08-17T12:51:04.0211751Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:51:04.0228167Z 2022-08-17T12:51:04.0228457Z Running tests... 
2022-08-17T12:51:04.0228900Z ---------------------------------------------------------------------- 2022-08-17T12:51:05.5443545Z test_device_maps_invalid_min_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:51:05.5623848Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18381 2022-08-17T12:51:05.5630780Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18382 2022-08-17T12:51:05.5637242Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18383 2022-08-17T12:51:05.5643344Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18384 2022-08-17T12:51:06.9673778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:06.9674323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:06.9675303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:06.9675998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:06.9708567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:06.9709039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:06.9711563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:06.9712391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:06.9761747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:06.9762210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:06.9764830Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:06.9765312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:06.9905661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:06.9906134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:06.9909290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:06.9909785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:07.1386548Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi66qioh2 2022-08-17T12:51:07.1387646Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjblwpwyr 2022-08-17T12:51:07.1388424Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi66qioh2/_remote_module_non_scriptable.py 2022-08-17T12:51:07.1390587Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjblwpwyr/_remote_module_non_scriptable.py 2022-08-17T12:51:07.1430322Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpckhrmmve 2022-08-17T12:51:07.1433145Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpckhrmmve/_remote_module_non_scriptable.py 2022-08-17T12:51:07.1644855Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprf_5k8rh 2022-08-17T12:51:07.1647665Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprf_5k8rh/_remote_module_non_scriptable.py 2022-08-17T12:51:07.5729243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:51:07.5758541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:51:07.5768022Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:51:07.5960698Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:51:07.6751847Z fi_getinfo: -61 2022-08-17T12:51:07.6777755Z fi_getinfo: -61 2022-08-17T12:51:07.6786721Z fi_getinfo: -61 2022-08-17T12:51:07.6982944Z fi_getinfo: -61 2022-08-17T12:51:08.0717366Z ok (4.049s) 2022-08-17T12:51:08.0717699Z 2022-08-17T12:51:08.0718157Z ---------------------------------------------------------------------- 2022-08-17T12:51:08.0718490Z Ran 1 test in 4.049s 2022-08-17T12:51:08.0718658Z 2022-08-17T12:51:08.0718752Z OK 2022-08-17T12:51:08.0718913Z 2022-08-17T12:51:08.0719052Z Generating XML reports... 2022-08-17T12:51:08.0755891Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125104.xml 2022-08-17T12:51:09.8341182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:09.8341853Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:09.8343894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:09.8344428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:10.0033953Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppz74lcp0 2022-08-17T12:51:10.0036522Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppz74lcp0/_remote_module_non_scriptable.py 2022-08-17T12:51:10.4222222Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:51:10.4239062Z 2022-08-17T12:51:10.4239490Z Running tests... 
2022-08-17T12:51:10.4239985Z ---------------------------------------------------------------------- 2022-08-17T12:51:11.9111131Z test_device_maps_many_to_one (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:51:11.9290861Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18556 2022-08-17T12:51:11.9297290Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18557 2022-08-17T12:51:11.9303650Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18558 2022-08-17T12:51:11.9311171Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18559 2022-08-17T12:51:13.3342872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:13.3343901Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:13.3347041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:13.3347540Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:13.3402085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:13.3402560Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:13.3405395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:13.3405964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:13.3406551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:13.3407004Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:13.3409408Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:13.3409898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:13.3629641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:13.3630098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:13.3632889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:13.3633366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:13.5034927Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2399zc37 2022-08-17T12:51:13.5036313Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2399zc37/_remote_module_non_scriptable.py 2022-08-17T12:51:13.5122710Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0fy3myhy 2022-08-17T12:51:13.5125633Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0fy3myhy/_remote_module_non_scriptable.py 2022-08-17T12:51:13.5141717Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8r87b4ld 2022-08-17T12:51:13.5144721Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8r87b4ld/_remote_module_non_scriptable.py 2022-08-17T12:51:13.5400497Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfwbgir3x 2022-08-17T12:51:13.5403221Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfwbgir3x/_remote_module_non_scriptable.py 2022-08-17T12:51:13.9382225Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:51:13.9445598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:51:13.9459173Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:51:13.9744141Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:51:14.0402695Z fi_getinfo: -61 2022-08-17T12:51:14.0465267Z fi_getinfo: -61 2022-08-17T12:51:14.0478892Z fi_getinfo: -61 2022-08-17T12:51:14.0763545Z fi_getinfo: -61 2022-08-17T12:51:14.5387750Z ok (4.115s) 2022-08-17T12:51:14.5387959Z 2022-08-17T12:51:14.5388351Z ---------------------------------------------------------------------- 2022-08-17T12:51:14.5388688Z Ran 1 test in 4.115s 2022-08-17T12:51:14.5388855Z 2022-08-17T12:51:14.5388932Z OK 2022-08-17T12:51:14.5389069Z 2022-08-17T12:51:14.5389207Z Generating XML reports... 2022-08-17T12:51:14.5424988Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125110.xml 2022-08-17T12:51:16.3232718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:16.3233235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:16.3235262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:16.3235802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:16.4973281Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphw3i14u8 2022-08-17T12:51:16.4975754Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphw3i14u8/_remote_module_non_scriptable.py 2022-08-17T12:51:16.9272867Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:51:16.9289224Z 2022-08-17T12:51:16.9289662Z Running tests... 
2022-08-17T12:51:16.9290603Z ---------------------------------------------------------------------- 2022-08-17T12:51:18.4604628Z test_device_maps_missing_config (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:51:18.4789891Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18743 2022-08-17T12:51:18.4796143Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18744 2022-08-17T12:51:18.4802521Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18745 2022-08-17T12:51:18.4809231Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18746 2022-08-17T12:51:19.8679403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:19.8680126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:19.8681440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:19.8681946Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:19.9087286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:19.9088067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:19.9091160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:19.9091723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:19.9478583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:19.9479254Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:19.9480860Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:19.9481664Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:19.9542966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:19.9544405Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:19.9546248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:19.9547038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:20.0369133Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3etckqbt 2022-08-17T12:51:20.0370596Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3etckqbt/_remote_module_non_scriptable.py 2022-08-17T12:51:20.0784994Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphb8dnzx_ 2022-08-17T12:51:20.0787380Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphb8dnzx_/_remote_module_non_scriptable.py 2022-08-17T12:51:20.1185223Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxstllnhz 2022-08-17T12:51:20.1186757Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxstllnhz/_remote_module_non_scriptable.py 2022-08-17T12:51:20.1296755Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp64y6620h 2022-08-17T12:51:20.1298589Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp64y6620h/_remote_module_non_scriptable.py 2022-08-17T12:51:20.4505511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:51:20.4999044Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:51:20.5394062Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:51:20.5526428Z fi_getinfo: -61 2022-08-17T12:51:20.5582649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:51:20.6018627Z fi_getinfo: -61 2022-08-17T12:51:20.6412328Z fi_getinfo: -61 2022-08-17T12:51:20.6601588Z fi_getinfo: -61 2022-08-17T12:51:23.0951007Z ok (6.166s) 2022-08-17T12:51:23.0951220Z 2022-08-17T12:51:23.0951627Z ---------------------------------------------------------------------- 2022-08-17T12:51:23.0951991Z Ran 1 test in 6.166s 2022-08-17T12:51:23.0952143Z 2022-08-17T12:51:23.0952236Z OK 2022-08-17T12:51:23.0952373Z 2022-08-17T12:51:23.0952508Z Generating XML reports... 2022-08-17T12:51:23.0989067Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125116.xml 2022-08-17T12:51:24.8818061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:24.8818563Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:24.8820102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:24.8820584Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:25.0573966Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpopfdza5w 2022-08-17T12:51:25.0576679Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpopfdza5w/_remote_module_non_scriptable.py 2022-08-17T12:51:25.4839274Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:51:25.4856028Z 2022-08-17T12:51:25.4856543Z Running tests... 
2022-08-17T12:51:25.4857052Z ---------------------------------------------------------------------- 2022-08-17T12:51:26.9994818Z test_device_maps_missing_config_loop (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:51:27.0174382Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19090 2022-08-17T12:51:27.0180564Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19091 2022-08-17T12:51:27.0188140Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 19092 2022-08-17T12:51:27.0194367Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 19093 2022-08-17T12:51:28.3990613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:28.3991131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:28.3992155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:28.3992661Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:28.4459728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:28.4460219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:28.4462283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:28.4462774Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:28.4495456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:28.4495902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:28.4498954Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:28.4499427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:28.4968304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:28.4968747Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:28.4971153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:28.4971632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:28.5681308Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxhqqj3xb 2022-08-17T12:51:28.5683077Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxhqqj3xb/_remote_module_non_scriptable.py 2022-08-17T12:51:28.6148480Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppyry2tmi 2022-08-17T12:51:28.6149680Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppyry2tmi/_remote_module_non_scriptable.py 2022-08-17T12:51:28.6172593Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpza_2h7bh 2022-08-17T12:51:28.6175444Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpza_2h7bh/_remote_module_non_scriptable.py 2022-08-17T12:51:28.6723386Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq2mycui5 2022-08-17T12:51:28.6724372Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq2mycui5/_remote_module_non_scriptable.py 2022-08-17T12:51:28.9857008Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:51:29.0367291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:51:29.0395955Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:51:29.0878754Z fi_getinfo: -61 2022-08-17T12:51:29.0968881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:51:29.1387582Z fi_getinfo: -61 2022-08-17T12:51:29.1413537Z fi_getinfo: -61 2022-08-17T12:51:29.1987594Z fi_getinfo: -61 2022-08-17T12:51:31.6328788Z ok (6.147s) 2022-08-17T12:51:31.6329129Z 2022-08-17T12:51:31.6329657Z ---------------------------------------------------------------------- 2022-08-17T12:51:31.6329984Z Ran 1 test in 6.147s 2022-08-17T12:51:31.6330149Z 2022-08-17T12:51:31.6330551Z OK 2022-08-17T12:51:31.6330692Z 2022-08-17T12:51:31.6333533Z Generating XML reports... 2022-08-17T12:51:31.6367740Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125125.xml 2022-08-17T12:51:33.3999768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:33.4000266Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:33.4001223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:33.4001695Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:33.5716719Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt7ep1sit 2022-08-17T12:51:33.5718944Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt7ep1sit/_remote_module_non_scriptable.py 2022-08-17T12:51:33.9893352Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:51:33.9909947Z 2022-08-17T12:51:33.9910240Z Running tests... 
2022-08-17T12:51:33.9910672Z ---------------------------------------------------------------------- 2022-08-17T12:51:35.4908250Z test_device_maps_missing_config_not_timeout (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:51:35.5091532Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19437 2022-08-17T12:51:35.5097675Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19438 2022-08-17T12:51:35.5103821Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 19439 2022-08-17T12:51:35.5110822Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 19440 2022-08-17T12:51:36.9017494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:36.9018021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:36.9018984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:36.9019450Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:36.9082828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:36.9083302Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:36.9086646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:36.9087113Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:36.9146633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:36.9147105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:36.9149978Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:36.9150758Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:36.9366471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:36.9366932Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:36.9369998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:36.9370474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:37.0688509Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu4ikbxhh 2022-08-17T12:51:37.0689086Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu4ikbxhh/_remote_module_non_scriptable.py 2022-08-17T12:51:37.0767629Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv0t3quuz 2022-08-17T12:51:37.0770502Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv0t3quuz/_remote_module_non_scriptable.py 2022-08-17T12:51:37.0808550Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps5fbv_zb 2022-08-17T12:51:37.0811048Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps5fbv_zb/_remote_module_non_scriptable.py 2022-08-17T12:51:37.1116318Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp20zzuyk 2022-08-17T12:51:37.1119402Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp20zzuyk/_remote_module_non_scriptable.py 2022-08-17T12:51:37.4939223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:51:37.5050294Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:51:37.5083150Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:51:37.5449637Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:51:37.5961839Z fi_getinfo: -61 2022-08-17T12:51:37.6071155Z fi_getinfo: -61 2022-08-17T12:51:37.6100731Z fi_getinfo: -61 2022-08-17T12:51:37.6468461Z fi_getinfo: -61 2022-08-17T12:51:40.1238984Z ok (6.133s) 2022-08-17T12:51:40.1239314Z 2022-08-17T12:51:40.1239753Z ---------------------------------------------------------------------- 2022-08-17T12:51:40.1240085Z Ran 1 test in 6.133s 2022-08-17T12:51:40.1240250Z 2022-08-17T12:51:40.1240346Z OK 2022-08-17T12:51:40.1240484Z 2022-08-17T12:51:40.1240618Z Generating XML reports... 2022-08-17T12:51:40.1277527Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125133.xml 2022-08-17T12:51:41.8825845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:41.8826349Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:41.8829136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:41.8830036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:42.0567518Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxhmvrxfr 2022-08-17T12:51:42.0570382Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxhmvrxfr/_remote_module_non_scriptable.py 2022-08-17T12:51:42.4841569Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:51:42.4858091Z 2022-08-17T12:51:42.4858549Z Running tests... 
2022-08-17T12:51:42.4859039Z ---------------------------------------------------------------------- 2022-08-17T12:51:43.9966434Z test_device_maps_missing_config_remote (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:51:44.0144100Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19784 2022-08-17T12:51:44.0151423Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19785 2022-08-17T12:51:44.0158044Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 19786 2022-08-17T12:51:44.0164307Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 19787 2022-08-17T12:51:45.4077832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:45.4078442Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:45.4079522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:45.4080002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:45.4120015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:45.4120498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:45.4123430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:45.4123929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:45.4237895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:45.4238372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:45.4241360Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:45.4241843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:45.4504701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:45.4505367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:45.4508387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:45.4508866Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:45.5766367Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkqiq_ylu 2022-08-17T12:51:45.5767894Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkqiq_ylu/_remote_module_non_scriptable.py 2022-08-17T12:51:45.5789578Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfytww2_j 2022-08-17T12:51:45.5792213Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfytww2_j/_remote_module_non_scriptable.py 2022-08-17T12:51:45.5936213Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7ze6pn0y 2022-08-17T12:51:45.5938779Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7ze6pn0y/_remote_module_non_scriptable.py 2022-08-17T12:51:45.6311233Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnsugrx47 2022-08-17T12:51:45.6313289Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnsugrx47/_remote_module_non_scriptable.py 2022-08-17T12:51:46.0064207Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:51:46.0069334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:51:46.0198868Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:51:46.0635052Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:51:46.1083104Z fi_getinfo: -61 2022-08-17T12:51:46.1087234Z fi_getinfo: -61 2022-08-17T12:51:46.1219137Z fi_getinfo: -61 2022-08-17T12:51:46.1655944Z fi_getinfo: -61 2022-08-17T12:51:48.6289952Z ok (6.143s) 2022-08-17T12:51:48.6290175Z 2022-08-17T12:51:48.6290877Z ---------------------------------------------------------------------- 2022-08-17T12:51:48.6291240Z Ran 1 test in 6.143s 2022-08-17T12:51:48.6291410Z 2022-08-17T12:51:48.6291491Z OK 2022-08-17T12:51:48.6291623Z 2022-08-17T12:51:48.6291759Z Generating XML reports... 2022-08-17T12:51:48.6327750Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125142.xml 2022-08-17T12:51:50.3628224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:50.3628856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:50.3630042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:50.3630839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:50.5390014Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqa6ght6u 2022-08-17T12:51:50.5393108Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqa6ght6u/_remote_module_non_scriptable.py 2022-08-17T12:51:50.9676566Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:51:50.9693151Z 2022-08-17T12:51:50.9693428Z Running tests... 
2022-08-17T12:51:50.9693912Z ---------------------------------------------------------------------- 2022-08-17T12:51:52.4764576Z test_device_maps_missing_config_remote_response (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:51:52.4949146Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20131 2022-08-17T12:51:52.4955761Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20132 2022-08-17T12:51:52.4962421Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20133 2022-08-17T12:51:52.4969192Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20134 2022-08-17T12:51:53.8872730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:53.8873469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:53.8874418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:53.8874897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:53.8956257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:53.8956720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:53.8959381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:53.8959867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:53.9114554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:53.9115044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:53.9118268Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:53.9118747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:53.9193800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:53.9194256Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:53.9197359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:53.9197846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:54.0585624Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptxx81fs1 2022-08-17T12:51:54.0586789Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptxx81fs1/_remote_module_non_scriptable.py 2022-08-17T12:51:54.0620867Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqudowm86 2022-08-17T12:51:54.0623750Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqudowm86/_remote_module_non_scriptable.py 2022-08-17T12:51:54.0852727Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxso82gop 2022-08-17T12:51:54.0855339Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxso82gop/_remote_module_non_scriptable.py 2022-08-17T12:51:54.0875941Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxs4tohmy 2022-08-17T12:51:54.0879294Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxs4tohmy/_remote_module_non_scriptable.py 2022-08-17T12:51:54.4857547Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:51:54.4894715Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:51:54.5106970Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:51:54.5275792Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:51:54.5879135Z fi_getinfo: -61 2022-08-17T12:51:54.5913687Z fi_getinfo: -61 2022-08-17T12:51:54.6127270Z fi_getinfo: -61 2022-08-17T12:51:54.6293644Z fi_getinfo: -61 2022-08-17T12:51:57.0100177Z ok (6.040s) 2022-08-17T12:51:57.0100382Z 2022-08-17T12:51:57.0100794Z ---------------------------------------------------------------------- 2022-08-17T12:51:57.0101145Z Ran 1 test in 6.041s 2022-08-17T12:51:57.0101312Z 2022-08-17T12:51:57.0101405Z OK 2022-08-17T12:51:57.0101541Z 2022-08-17T12:51:57.0101678Z Generating XML reports... 2022-08-17T12:51:57.0138490Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125150.xml 2022-08-17T12:51:58.7860667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:51:58.7861172Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:51:58.7861965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:51:58.7862437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:51:58.9593668Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppqra5bmf 2022-08-17T12:51:58.9595618Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppqra5bmf/_remote_module_non_scriptable.py 2022-08-17T12:51:59.3844036Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:51:59.3860150Z 2022-08-17T12:51:59.3860436Z Running tests... 
2022-08-17T12:51:59.3860874Z ---------------------------------------------------------------------- 2022-08-17T12:52:00.8921303Z test_device_maps_missing_config_response (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:52:00.9103478Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20478 2022-08-17T12:52:00.9110312Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20479 2022-08-17T12:52:00.9116338Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20480 2022-08-17T12:52:00.9123196Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20481 2022-08-17T12:52:02.3169849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:02.3170383Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:02.3171456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:02.3171941Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:02.3736543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:02.3737026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:02.3738652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:02.3739115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:02.3887634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:02.3888370Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:02.3890842Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:02.3891310Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:02.3901329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:02.3901791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:02.3905109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:02.3905573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:02.4883949Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_tygeyqe 2022-08-17T12:52:02.4884960Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_tygeyqe/_remote_module_non_scriptable.py 2022-08-17T12:52:02.5429476Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgqh02m_y 2022-08-17T12:52:02.5430477Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgqh02m_y/_remote_module_non_scriptable.py 2022-08-17T12:52:02.5626033Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzpyxwx_u 2022-08-17T12:52:02.5628095Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzpyxwx_u/_remote_module_non_scriptable.py 2022-08-17T12:52:02.5646551Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyudn7zbc 2022-08-17T12:52:02.5649546Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyudn7zbc/_remote_module_non_scriptable.py 2022-08-17T12:52:02.9031568Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:52:02.9703211Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:52:02.9826981Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:52:02.9841086Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:52:03.0052416Z fi_getinfo: -61 2022-08-17T12:52:03.0724346Z fi_getinfo: -61 2022-08-17T12:52:03.0845452Z fi_getinfo: -61 2022-08-17T12:52:03.0859116Z fi_getinfo: -61 2022-08-17T12:52:05.4248620Z ok (6.038s) 2022-08-17T12:52:05.4248846Z 2022-08-17T12:52:05.4249218Z ---------------------------------------------------------------------- 2022-08-17T12:52:05.4249564Z Ran 1 test in 6.039s 2022-08-17T12:52:05.4249728Z 2022-08-17T12:52:05.4249822Z OK 2022-08-17T12:52:05.4249956Z 2022-08-17T12:52:05.4250094Z Generating XML reports... 2022-08-17T12:52:05.4287116Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125159.xml 2022-08-17T12:52:07.1653395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:07.1654431Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:07.1655599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:07.1656087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:07.3401300Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpltcvxp7l 2022-08-17T12:52:07.3402993Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpltcvxp7l/_remote_module_non_scriptable.py 2022-08-17T12:52:07.7663476Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:52:07.7679923Z 2022-08-17T12:52:07.7680189Z Running tests... 
2022-08-17T12:52:07.7680635Z ---------------------------------------------------------------------- 2022-08-17T12:52:09.2810928Z test_device_maps_missing_config_response_loop (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:52:09.2996750Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20825 2022-08-17T12:52:09.3003143Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20826 2022-08-17T12:52:09.3009859Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20827 2022-08-17T12:52:09.3017601Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20828 2022-08-17T12:52:10.7019048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:10.7019572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:10.7020151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:10.7020644Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:10.7146657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:10.7147126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:10.7150212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:10.7150687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:10.7293109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:10.7293569Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:10.7296340Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:10.7296829Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:10.7322287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:10.7322749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:10.7326541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:10.7327018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:10.8701664Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdyp8kj1f 2022-08-17T12:52:10.8702463Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdyp8kj1f/_remote_module_non_scriptable.py 2022-08-17T12:52:10.8876960Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyim1wl91 2022-08-17T12:52:10.8879718Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyim1wl91/_remote_module_non_scriptable.py 2022-08-17T12:52:10.9024896Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3s001jo1 2022-08-17T12:52:10.9028281Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3s001jo1/_remote_module_non_scriptable.py 2022-08-17T12:52:10.9108888Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk4efyp8v 2022-08-17T12:52:10.9112559Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk4efyp8v/_remote_module_non_scriptable.py 2022-08-17T12:52:11.2963513Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:52:11.3243363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:52:11.3244338Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:52:11.3415305Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:52:11.3982779Z fi_getinfo: -61 2022-08-17T12:52:11.4262920Z fi_getinfo: -61 2022-08-17T12:52:11.4266667Z fi_getinfo: -61 2022-08-17T12:52:11.4434613Z fi_getinfo: -61 2022-08-17T12:52:13.9140712Z ok (6.146s) 2022-08-17T12:52:13.9140919Z 2022-08-17T12:52:13.9141337Z ---------------------------------------------------------------------- 2022-08-17T12:52:13.9141663Z Ran 1 test in 6.146s 2022-08-17T12:52:13.9141826Z 2022-08-17T12:52:13.9141919Z OK 2022-08-17T12:52:13.9142052Z 2022-08-17T12:52:13.9142187Z Generating XML reports... 2022-08-17T12:52:13.9179198Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125207.xml 2022-08-17T12:52:15.6985988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:15.6986672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:15.6987814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:15.6988327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:15.8739938Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd195brfh 2022-08-17T12:52:15.8742849Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd195brfh/_remote_module_non_scriptable.py 2022-08-17T12:52:16.3031390Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:52:16.3047988Z 2022-08-17T12:52:16.3048231Z Running tests... 
2022-08-17T12:52:16.3048887Z ---------------------------------------------------------------------- 2022-08-17T12:52:17.8073903Z test_device_maps_multi_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:52:17.8253155Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21172 2022-08-17T12:52:17.8260116Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21173 2022-08-17T12:52:17.8267185Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 21174 2022-08-17T12:52:17.8273448Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 21175 2022-08-17T12:52:19.2158504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:19.2159032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:19.2160161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:19.2160643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:19.2182170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:19.2182614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:19.2185486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:19.2186239Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:19.2581756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:19.2582205Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:19.2584626Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:19.2585096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:19.3062424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:19.3062868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:19.3065627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:19.3066135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:19.3877283Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmshe5la0 2022-08-17T12:52:19.3879911Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmshe5la0/_remote_module_non_scriptable.py 2022-08-17T12:52:19.3889326Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz0f05dfq 2022-08-17T12:52:19.3892049Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz0f05dfq/_remote_module_non_scriptable.py 2022-08-17T12:52:19.4268855Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfhj8v5yp 2022-08-17T12:52:19.4271758Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfhj8v5yp/_remote_module_non_scriptable.py 2022-08-17T12:52:19.4819947Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3jr1se89 2022-08-17T12:52:19.4821908Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3jr1se89/_remote_module_non_scriptable.py 2022-08-17T12:52:19.8071023Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:52:19.8158637Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:52:19.8426746Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:52:19.9058451Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:52:19.9161609Z fi_getinfo: -61 2022-08-17T12:52:19.9177787Z fi_getinfo: -61 2022-08-17T12:52:19.9446124Z fi_getinfo: -61 2022-08-17T12:52:20.0078708Z fi_getinfo: -61 2022-08-17T12:52:25.7482076Z ok (9.443s) 2022-08-17T12:52:25.7482313Z 2022-08-17T12:52:25.7482702Z ---------------------------------------------------------------------- 2022-08-17T12:52:25.7483071Z Ran 1 test in 9.443s 2022-08-17T12:52:25.7483221Z 2022-08-17T12:52:25.7483317Z OK 2022-08-17T12:52:25.7485305Z 2022-08-17T12:52:25.7485852Z Generating XML reports... 2022-08-17T12:52:25.7521839Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125216.xml 2022-08-17T12:52:27.5457137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:27.5457678Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:27.5458779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:27.5459258Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:27.7196315Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpds6_y8oq 2022-08-17T12:52:27.7198707Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpds6_y8oq/_remote_module_non_scriptable.py 2022-08-17T12:52:28.1444271Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:52:28.1460395Z 2022-08-17T12:52:28.1460560Z Running tests... 
2022-08-17T12:52:28.1461020Z ---------------------------------------------------------------------- 2022-08-17T12:52:29.6520359Z test_device_maps_multi_gpu_self (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:52:29.6703969Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21531 2022-08-17T12:52:29.6710269Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21532 2022-08-17T12:52:29.6716626Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 21533 2022-08-17T12:52:29.6723200Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 21534 2022-08-17T12:52:31.0641110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:31.0641608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:31.0642622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:31.0643126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:31.0718051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:31.0718517Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:31.0720925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:31.0721385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:31.0889673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:31.0890151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:31.0893203Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:31.0893670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:31.1134777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:31.1135244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:31.1137722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:31.1138199Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:31.2338488Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnhq4g1h5 2022-08-17T12:52:31.2339622Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnhq4g1h5/_remote_module_non_scriptable.py 2022-08-17T12:52:31.2457318Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxxtwo323 2022-08-17T12:52:31.2459845Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxxtwo323/_remote_module_non_scriptable.py 2022-08-17T12:52:31.2625597Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf8csrrgv 2022-08-17T12:52:31.2628908Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf8csrrgv/_remote_module_non_scriptable.py 2022-08-17T12:52:31.2815959Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp69pbzyol 2022-08-17T12:52:31.2817979Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp69pbzyol/_remote_module_non_scriptable.py 2022-08-17T12:52:31.6664745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:52:31.6735688Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:52:31.6996933Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:52:31.7033132Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:52:31.7683730Z fi_getinfo: -61 2022-08-17T12:52:31.7756021Z fi_getinfo: -61 2022-08-17T12:52:31.8016292Z fi_getinfo: -61 2022-08-17T12:52:31.8049726Z fi_getinfo: -61 2022-08-17T12:52:37.5931194Z ok (9.447s) 2022-08-17T12:52:37.5931397Z 2022-08-17T12:52:37.5931806Z ---------------------------------------------------------------------- 2022-08-17T12:52:37.5932127Z Ran 1 test in 9.447s 2022-08-17T12:52:37.5932302Z 2022-08-17T12:52:37.5932398Z OK 2022-08-17T12:52:37.5932536Z 2022-08-17T12:52:37.5932673Z Generating XML reports... 2022-08-17T12:52:37.5972768Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125228.xml 2022-08-17T12:52:39.3327159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:39.3328198Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:39.3329427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:39.3330344Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:39.5065019Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjt2gpchd 2022-08-17T12:52:39.5067751Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjt2gpchd/_remote_module_non_scriptable.py 2022-08-17T12:52:39.9345074Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:52:39.9362926Z 2022-08-17T12:52:39.9363397Z Running tests... 
2022-08-17T12:52:39.9363944Z ---------------------------------------------------------------------- 2022-08-17T12:52:41.4404117Z test_device_maps_one_to_many (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:52:41.4589657Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21882 2022-08-17T12:52:41.4596229Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21883 2022-08-17T12:52:41.4603004Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 21884 2022-08-17T12:52:41.4609963Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 21885 2022-08-17T12:52:42.8391942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:42.8392898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:42.8394107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:42.8395045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:42.8530105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:42.8530996Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:42.8532539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:42.8533468Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:42.8538215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:42.8539079Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:42.8541549Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:42.8542517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:42.8756589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:42.8757877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:42.8760524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:42.8761494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:43.0080504Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph2px_zm_ 2022-08-17T12:52:43.0081936Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph2px_zm_/_remote_module_non_scriptable.py 2022-08-17T12:52:43.0278076Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpewkh6kbk 2022-08-17T12:52:43.0281131Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpewkh6kbk/_remote_module_non_scriptable.py 2022-08-17T12:52:43.0281936Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwozeoc1s 2022-08-17T12:52:43.0284946Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwozeoc1s/_remote_module_non_scriptable.py 2022-08-17T12:52:43.0508220Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkpcv0jxp 2022-08-17T12:52:43.0511281Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkpcv0jxp/_remote_module_non_scriptable.py 2022-08-17T12:52:43.4430708Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:52:43.4538781Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:52:43.4607001Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:52:43.4819520Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:52:43.5561056Z fi_getinfo: -61 2022-08-17T12:52:43.9683298Z ok (4.032s) 2022-08-17T12:52:43.9683483Z 2022-08-17T12:52:43.9683898Z ---------------------------------------------------------------------- 2022-08-17T12:52:43.9684236Z Ran 1 test in 4.032s 2022-08-17T12:52:43.9684414Z 2022-08-17T12:52:43.9684508Z OK 2022-08-17T12:52:43.9684649Z 2022-08-17T12:52:43.9684785Z Generating XML reports... 2022-08-17T12:52:43.9721310Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125239.xml 2022-08-17T12:52:45.7768740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:45.7769309Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:45.7770266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:45.7770765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:45.9527793Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmu6ju7ns 2022-08-17T12:52:45.9530026Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmu6ju7ns/_remote_module_non_scriptable.py 2022-08-17T12:52:46.3771956Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:52:46.3788881Z 2022-08-17T12:52:46.3789346Z Running tests... 2022-08-17T12:52:46.3789858Z ---------------------------------------------------------------------- 2022-08-17T12:52:47.8755257Z test_device_maps_remote (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:52:47.8946750Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22054 2022-08-17T12:52:47.8953036Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22055 2022-08-17T12:52:47.8959535Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22056 2022-08-17T12:52:47.8965915Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22057 2022-08-17T12:52:49.3839779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:49.3840307Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:49.3841154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:49.3841636Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:49.3846897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:49.3847356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:49.3850227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:49.3850850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:49.3954532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:49.3954968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:49.3957746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:49.3958224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:49.4426986Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:49.4427425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:49.4429937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:49.4430415Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:49.5565291Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1hju8rqr 2022-08-17T12:52:49.5566117Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1hju8rqr/_remote_module_non_scriptable.py 2022-08-17T12:52:49.5569669Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsapip1ym 2022-08-17T12:52:49.5572319Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsapip1ym/_remote_module_non_scriptable.py 2022-08-17T12:52:49.5679432Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5hqb9u01 2022-08-17T12:52:49.5682209Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5hqb9u01/_remote_module_non_scriptable.py 2022-08-17T12:52:49.6100017Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpraxqk4eq 2022-08-17T12:52:49.6100943Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpraxqk4eq/_remote_module_non_scriptable.py 2022-08-17T12:52:49.9880851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:52:49.9882978Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:52:50.0022207Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:52:50.0235948Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:52:50.0901991Z fi_getinfo: -61 
2022-08-17T12:52:50.0908146Z fi_getinfo: -61 2022-08-17T12:52:50.1041678Z fi_getinfo: -61 2022-08-17T12:52:50.1254749Z fi_getinfo: -61 2022-08-17T12:52:55.9178537Z ok (9.539s) 2022-08-17T12:52:55.9178753Z 2022-08-17T12:52:55.9179169Z ---------------------------------------------------------------------- 2022-08-17T12:52:55.9179489Z Ran 1 test in 9.539s 2022-08-17T12:52:55.9179664Z 2022-08-17T12:52:55.9179780Z OK 2022-08-17T12:52:55.9179917Z 2022-08-17T12:52:55.9180051Z Generating XML reports... 2022-08-17T12:52:55.9218904Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125246.xml 2022-08-17T12:52:57.6571879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:52:57.6572389Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:52:57.6573229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:52:57.6573707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:52:57.8231491Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcue497g3 2022-08-17T12:52:57.8233919Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcue497g3/_remote_module_non_scriptable.py 2022-08-17T12:52:58.2308167Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:52:58.2322777Z 2022-08-17T12:52:58.2323149Z Running tests... 2022-08-17T12:52:58.2323678Z ---------------------------------------------------------------------- 2022-08-17T12:52:59.7010497Z test_device_maps_return_to_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:52:59.7186475Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22413 2022-08-17T12:52:59.7192805Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22414 2022-08-17T12:52:59.7199194Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22415 2022-08-17T12:52:59.7205347Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22416 2022-08-17T12:53:01.1039827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:01.1040365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:01.1041152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:01.1041635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:01.1573572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:01.1574075Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:01.1574807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:01.1575283Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:01.1634445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:01.1634921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:01.1635463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:01.1636052Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:01.1637468Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:01.1637948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:01.1638524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:01.1638987Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:01.2736678Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg8g9w7ml 2022-08-17T12:53:01.2737490Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg8g9w7ml/_remote_module_non_scriptable.py 2022-08-17T12:53:01.3298264Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppr5asp6v 2022-08-17T12:53:01.3299852Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppr5asp6v/_remote_module_non_scriptable.py 2022-08-17T12:53:01.3446689Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp00ebqdr 2022-08-17T12:53:01.3448613Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp00ebqdr/_remote_module_non_scriptable.py 2022-08-17T12:53:01.3807348Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeqxgirtg 2022-08-17T12:53:01.3808892Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeqxgirtg/_remote_module_non_scriptable.py 2022-08-17T12:53:01.6908095Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:53:01.7542372Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:53:01.7776625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:53:01.8003531Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:53:02.1280862Z skip: Need at 
least 4 CUDA devices (3.895s) 2022-08-17T12:53:02.1281222Z 2022-08-17T12:53:02.1281618Z ---------------------------------------------------------------------- 2022-08-17T12:53:02.1281965Z Ran 1 test in 3.896s 2022-08-17T12:53:02.1282131Z 2022-08-17T12:53:02.1282225Z OK (skipped=1) 2022-08-17T12:53:02.1282385Z 2022-08-17T12:53:02.1282517Z Generating XML reports... 2022-08-17T12:53:02.1318809Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125258.xml 2022-08-17T12:53:03.8969409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:03.8969939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:03.8971106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:03.8971606Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:04.0700718Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb97mzlaz 2022-08-17T12:53:04.0703763Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb97mzlaz/_remote_module_non_scriptable.py 2022-08-17T12:53:04.4952031Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:53:04.4968224Z 2022-08-17T12:53:04.4968648Z Running tests... 2022-08-17T12:53:04.4969151Z ---------------------------------------------------------------------- 2022-08-17T12:53:06.0023264Z test_device_maps_return_to_gpu_self (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:53:06.0201183Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22584 2022-08-17T12:53:06.0207882Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22585 2022-08-17T12:53:06.0213468Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22586 2022-08-17T12:53:06.0219776Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22587 2022-08-17T12:53:07.4131089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:07.4131602Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:07.4132476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:07.4132963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:07.4133597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:07.4134264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:07.4137476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:07.4138060Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:07.4644632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:07.4645126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:07.4647563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:07.4648055Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:07.4713368Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:07.4714262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:07.4716713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:07.4717310Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:07.5871068Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_me9o6lc 2022-08-17T12:53:07.5872355Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_me9o6lc/_remote_module_non_scriptable.py 2022-08-17T12:53:07.5879889Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpalwjjq_u 2022-08-17T12:53:07.5882597Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpalwjjq_u/_remote_module_non_scriptable.py 2022-08-17T12:53:07.6379533Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplg6oe7qz 2022-08-17T12:53:07.6380933Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplg6oe7qz/_remote_module_non_scriptable.py 2022-08-17T12:53:07.6436456Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppqe59xbj 2022-08-17T12:53:07.6439049Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppqe59xbj/_remote_module_non_scriptable.py 2022-08-17T12:53:08.0166412Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:53:08.0182553Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:53:08.0728887Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:53:08.0787028Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:53:08.4290815Z skip: Need at least 4 CUDA 
devices (3.932s) 2022-08-17T12:53:08.4291110Z 2022-08-17T12:53:08.4291678Z ---------------------------------------------------------------------- 2022-08-17T12:53:08.4292021Z Ran 1 test in 3.932s 2022-08-17T12:53:08.4292187Z 2022-08-17T12:53:08.4292300Z OK (skipped=1) 2022-08-17T12:53:08.4292465Z 2022-08-17T12:53:08.4292597Z Generating XML reports... 2022-08-17T12:53:08.4329393Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125304.xml 2022-08-17T12:53:10.2094162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:10.2094657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:10.2095424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:10.2095905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:10.3838356Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpok9jei1a 2022-08-17T12:53:10.3840940Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpok9jei1a/_remote_module_non_scriptable.py 2022-08-17T12:53:10.8039243Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:53:10.8054675Z 2022-08-17T12:53:10.8055467Z Running tests... 2022-08-17T12:53:10.8056153Z ---------------------------------------------------------------------- 2022-08-17T12:53:12.3157173Z test_device_maps_wrong_worker_name (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:53:12.3334566Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22755 2022-08-17T12:53:12.3340844Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22756 2022-08-17T12:53:12.3348988Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22757 2022-08-17T12:53:12.3355771Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22758 2022-08-17T12:53:13.7305472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:13.7306428Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:13.7311030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:13.7311965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:13.7365764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:13.7366586Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:13.7368982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:13.7369849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:13.7497624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:13.7498554Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:13.7500499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:13.7501453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:13.7667009Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:13.7667918Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:13.7669956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:13.7687159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:13.8999261Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp268bh6s3 2022-08-17T12:53:13.9000587Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp268bh6s3/_remote_module_non_scriptable.py 2022-08-17T12:53:13.9052223Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph7bi7xyt 2022-08-17T12:53:13.9054850Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph7bi7xyt/_remote_module_non_scriptable.py 2022-08-17T12:53:13.9230198Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfys5wxl1 2022-08-17T12:53:13.9232330Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfys5wxl1/_remote_module_non_scriptable.py 2022-08-17T12:53:13.9329981Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkg5svthg 2022-08-17T12:53:13.9332728Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkg5svthg/_remote_module_non_scriptable.py 2022-08-17T12:53:14.3303615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:53:14.3342403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:53:14.3500625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:53:14.3613171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:53:14.4344889Z fi_getinfo: -61 
2022-08-17T12:53:14.4361491Z fi_getinfo: -61 2022-08-17T12:53:14.4518343Z fi_getinfo: -61 2022-08-17T12:53:14.4631084Z fi_getinfo: -61 2022-08-17T12:53:14.8876344Z ok (4.082s) 2022-08-17T12:53:14.8876710Z 2022-08-17T12:53:14.8877129Z ---------------------------------------------------------------------- 2022-08-17T12:53:14.8877474Z Ran 1 test in 4.082s 2022-08-17T12:53:14.8877643Z 2022-08-17T12:53:14.8877739Z OK 2022-08-17T12:53:14.8877858Z 2022-08-17T12:53:14.8877998Z Generating XML reports... 2022-08-17T12:53:14.8913795Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125310.xml 2022-08-17T12:53:16.6575990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:16.6576523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:16.6577602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:16.6578081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:16.8312651Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprnqfmwpx 2022-08-17T12:53:16.8315348Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprnqfmwpx/_remote_module_non_scriptable.py 2022-08-17T12:53:17.2539687Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:53:17.2556100Z 2022-08-17T12:53:17.2556460Z Running tests... 2022-08-17T12:53:17.2556912Z ---------------------------------------------------------------------- 2022-08-17T12:53:18.7657238Z test_device_mismatch (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:53:18.7833976Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22942 2022-08-17T12:53:18.7840470Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22943 2022-08-17T12:53:18.7846680Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22944 2022-08-17T12:53:18.7852815Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22945 2022-08-17T12:53:20.1798590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:20.1799649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:20.1800846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:20.1801821Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:20.2120694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:20.2121702Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:20.2122892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:20.2123826Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:20.2559266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:20.2560248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:20.2561421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:20.2562407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:20.3378898Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:20.3380325Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:20.3381554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:20.3382516Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:20.3481719Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6ddknyy_ 2022-08-17T12:53:20.3483805Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6ddknyy_/_remote_module_non_scriptable.py 2022-08-17T12:53:20.3814624Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppsbfj2po 2022-08-17T12:53:20.3816371Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppsbfj2po/_remote_module_non_scriptable.py 2022-08-17T12:53:20.4255291Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkrbwxjdj 2022-08-17T12:53:20.4256583Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkrbwxjdj/_remote_module_non_scriptable.py 2022-08-17T12:53:20.5211195Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptfrqxg2_ 2022-08-17T12:53:20.5212319Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptfrqxg2_/_remote_module_non_scriptable.py 2022-08-17T12:53:20.7703835Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:53:20.8022191Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:53:20.8406596Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:53:20.8724308Z fi_getinfo: -61 2022-08-17T12:53:20.9040895Z fi_getinfo: -61 2022-08-17T12:53:20.9425577Z fi_getinfo: -61 2022-08-17T12:53:20.9475845Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:53:21.0493836Z fi_getinfo: -61 2022-08-17T12:53:23.8888662Z On WorkerInfo(id=1, name=worker1): 2022-08-17T12:53:23.8901545Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7ff41c2343bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7ff41c22fd8e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7ff426842f33 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7ff42684440f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7ff426845b42 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7ff426a1475e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x2a72a1e (0x7ff41eee6a1e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #7: + 0x2a72b26 (0x7ff41eee6b26 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 
(0x7ff427404b18 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: + 0x31dfc2a (0x7ff428b6dc2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x31e0399 (0x7ff428b6e399 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7ff427439ac2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x32b427 (0x7ff4335c6427 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #13: + 0x32b766 (0x7ff4335c6766 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #14: + 0x1ddc68 (0x55fd16163c68 in /opt/conda/bin/python)\nframe #15: + 0x199499 (0x55fd1611f499 in /opt/conda/bin/python)\nframe #16: + 0x1995fa (0x55fd1611f5fa in /opt/conda/bin/python)\nframe #17: PyNumber_Add + 0x41 (0x55fd160cb4b1 in /opt/conda/bin/python)\nframe #18: _PyEval_EvalFrameDefault + 0x1008 (0x55fd16168098 in /opt/conda/bin/python)\nframe #19: + 0x18f742 (0x55fd16115742 in /opt/conda/bin/python)\nframe #20: _PyObject_Call + 0x20a (0x55fd160cdfaa in /opt/conda/bin/python)\nframe #21: _PyEval_EvalFrameDefault + 0x26e4 (0x55fd16169774 in /opt/conda/bin/python)\nframe #22: + 0x18f742 (0x55fd16115742 in /opt/conda/bin/python)\nframe #23: _PyObject_Call + 0x20a (0x55fd160cdfaa in /opt/conda/bin/python)\nframe #24: + 0xa0ab2a (0x7ff433ca5b2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7ff433ca3d6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7ff433ca6f05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #27: 
torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7ff433caaaa6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7ff429ecee1c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7ff433ca6be5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #30: + 0x453a313 (0x7ff429ec8313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7ff429ec8f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7ff429ec3597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #33: + 0x456a202 (0x7ff429ef8202 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7ff41c2227eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #35: + 0xdbbf4 (0x7ff44b378bf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #36: + 0x76db (0x7ff46b9cc6db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #37: clone + 0x3f (0x7ff46b6f561f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-08-17T12:53:23.8910603Z Traceback (most recent call last): 2022-08-17T12:53:23.8911167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", 
line 206, in _run_function 2022-08-17T12:53:23.8911633Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-08-17T12:53:23.8912293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5911, in _gpu_add_wrong_gpus 2022-08-17T12:53:23.8912715Z return x.cpu() + y.cuda() 2022-08-17T12:53:23.8913095Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! 2022-08-17T12:53:23.8913640Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first): 2022-08-17T12:53:23.8914669Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7ff41c2343bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:53:23.8915642Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7ff41c22fd8e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:53:23.8916542Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7ff426842f33 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8917322Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7ff42684440f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8918221Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7ff426845b42 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8919123Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7ff426a1475e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8919860Z frame 
#6: + 0x2a72a1e (0x7ff41eee6a1e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:53:23.8920526Z frame #7: + 0x2a72b26 (0x7ff41eee6b26 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:53:23.8921328Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7ff427404b18 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8922077Z frame #9: + 0x31dfc2a (0x7ff428b6dc2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8922729Z frame #10: + 0x31e0399 (0x7ff428b6e399 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8923496Z frame #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7ff427439ac2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8924187Z frame #12: + 0x32b427 (0x7ff4335c6427 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.8924827Z frame #13: + 0x32b766 (0x7ff4335c6766 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.8925289Z frame #14: + 0x1ddc68 (0x55fd16163c68 in /opt/conda/bin/python) 2022-08-17T12:53:23.8925698Z frame #15: + 0x199499 (0x55fd1611f499 in /opt/conda/bin/python) 2022-08-17T12:53:23.8926088Z frame #16: + 0x1995fa (0x55fd1611f5fa in /opt/conda/bin/python) 2022-08-17T12:53:23.8926562Z frame #17: PyNumber_Add + 0x41 (0x55fd160cb4b1 in /opt/conda/bin/python) 2022-08-17T12:53:23.8926990Z frame #18: _PyEval_EvalFrameDefault + 0x1008 (0x55fd16168098 in /opt/conda/bin/python) 2022-08-17T12:53:23.8927408Z frame #19: + 0x18f742 (0x55fd16115742 in /opt/conda/bin/python) 2022-08-17T12:53:23.8927784Z frame #20: _PyObject_Call + 0x20a (0x55fd160cdfaa in /opt/conda/bin/python) 2022-08-17T12:53:23.8928201Z frame #21: 
_PyEval_EvalFrameDefault + 0x26e4 (0x55fd16169774 in /opt/conda/bin/python) 2022-08-17T12:53:23.8928612Z frame #22: + 0x18f742 (0x55fd16115742 in /opt/conda/bin/python) 2022-08-17T12:53:23.8928983Z frame #23: _PyObject_Call + 0x20a (0x55fd160cdfaa in /opt/conda/bin/python) 2022-08-17T12:53:23.8929592Z frame #24: + 0xa0ab2a (0x7ff433ca5b2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.8930452Z frame #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7ff433ca3d6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.8931465Z frame #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7ff433ca6f05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.8932577Z frame #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7ff433caaaa6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.8933778Z frame #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7ff429ecee1c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8935069Z frame #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7ff433ca6be5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.8935954Z frame #30: + 0x453a313 (0x7ff429ec8313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8936893Z frame #31: 
torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7ff429ec8f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8937965Z frame #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7ff429ec3597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8938759Z frame #33: + 0x456a202 (0x7ff429ef8202 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.8939423Z frame #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7ff41c2227eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:53:23.8939930Z frame #35: + 0xdbbf4 (0x7ff44b378bf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-08-17T12:53:23.8940481Z frame #36: + 0x76db (0x7ff46b9cc6db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-08-17T12:53:23.8940990Z frame #37: clone + 0x3f (0x7ff46b6f561f in /lib/x86_64-linux-gnu/libc.so.6) 2022-08-17T12:53:23.8941201Z 2022-08-17T12:53:23.8941225Z 2022-08-17T12:53:23.9094835Z On WorkerInfo(id=0, name=worker0): 2022-08-17T12:53:23.9108290Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fd35dc863bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7fd35dc81d8e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7fd368294f33 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7fd36829640f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7fd368297b42 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7fd36846675e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x2a72a1e (0x7fd360938a1e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #7: + 0x2a72b26 (0x7fd360938b26 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7fd368e56b18 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: + 0x31dfc2a (0x7fd36a5bfc2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x31e0399 (0x7fd36a5c0399 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7fd368e8bac2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x32b427 (0x7fd375018427 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #13: + 0x32b766 (0x7fd375018766 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #14: + 0x1ddc68 (0x5596f8d26c68 in /opt/conda/bin/python)\nframe #15: + 0x199499 (0x5596f8ce2499 in /opt/conda/bin/python)\nframe #16: + 0x1995fa (0x5596f8ce25fa in /opt/conda/bin/python)\nframe #17: PyNumber_Add + 0x41 (0x5596f8c8e4b1 in 
/opt/conda/bin/python)\nframe #18: _PyEval_EvalFrameDefault + 0x1008 (0x5596f8d2b098 in /opt/conda/bin/python)\nframe #19: + 0x18f742 (0x5596f8cd8742 in /opt/conda/bin/python)\nframe #20: _PyObject_Call + 0x20a (0x5596f8c90faa in /opt/conda/bin/python)\nframe #21: _PyEval_EvalFrameDefault + 0x26e4 (0x5596f8d2c774 in /opt/conda/bin/python)\nframe #22: + 0x18f742 (0x5596f8cd8742 in /opt/conda/bin/python)\nframe #23: _PyObject_Call + 0x20a (0x5596f8c90faa in /opt/conda/bin/python)\nframe #24: + 0xa0ab2a (0x7fd3756f7b2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fd3756f5d6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fd3756f8f05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7fd3756fcaa6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7fd36b920e1c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fd3756f8be5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #30: + 0x453a313 (0x7fd36b91a313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #31: 
torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fd36b91af08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fd36b915597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #33: + 0x456a202 (0x7fd36b94a202 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fd35dc747eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #35: + 0xdbbf4 (0x7fd38cdcabf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #36: + 0x76db (0x7fd3ad41e6db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #37: clone + 0x3f (0x7fd3ad14761f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-08-17T12:53:23.9115718Z Traceback (most recent call last): 2022-08-17T12:53:23.9116274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function 2022-08-17T12:53:23.9116720Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-08-17T12:53:23.9117377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5911, in _gpu_add_wrong_gpus 2022-08-17T12:53:23.9117801Z return x.cpu() + y.cuda() 2022-08-17T12:53:23.9118203Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! 
2022-08-17T12:53:23.9118725Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first): 2022-08-17T12:53:23.9119580Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fd35dc863bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:53:23.9120567Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7fd35dc81d8e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:53:23.9121473Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7fd368294f33 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9122267Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7fd36829640f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9123143Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7fd368297b42 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9124103Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7fd36846675e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9124847Z frame #6: + 0x2a72a1e (0x7fd360938a1e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:53:23.9125506Z frame #7: + 0x2a72b26 (0x7fd360938b26 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:53:23.9126332Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7fd368e56b18 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9127061Z frame #9: + 0x31dfc2a (0x7fd36a5bfc2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9127773Z frame #10: + 0x31e0399 (0x7fd36a5c0399 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9128544Z frame #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7fd368e8bac2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9129256Z frame #12: + 0x32b427 (0x7fd375018427 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9129878Z frame #13: + 0x32b766 (0x7fd375018766 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9130345Z frame #14: + 0x1ddc68 (0x5596f8d26c68 in /opt/conda/bin/python) 2022-08-17T12:53:23.9130753Z frame #15: + 0x199499 (0x5596f8ce2499 in /opt/conda/bin/python) 2022-08-17T12:53:23.9131166Z frame #16: + 0x1995fa (0x5596f8ce25fa in /opt/conda/bin/python) 2022-08-17T12:53:23.9131550Z frame #17: PyNumber_Add + 0x41 (0x5596f8c8e4b1 in /opt/conda/bin/python) 2022-08-17T12:53:23.9131964Z frame #18: _PyEval_EvalFrameDefault + 0x1008 (0x5596f8d2b098 in /opt/conda/bin/python) 2022-08-17T12:53:23.9132386Z frame #19: + 0x18f742 (0x5596f8cd8742 in /opt/conda/bin/python) 2022-08-17T12:53:23.9132765Z frame #20: _PyObject_Call + 0x20a (0x5596f8c90faa in /opt/conda/bin/python) 2022-08-17T12:53:23.9133182Z frame #21: _PyEval_EvalFrameDefault + 0x26e4 (0x5596f8d2c774 in /opt/conda/bin/python) 2022-08-17T12:53:23.9133591Z frame #22: + 0x18f742 (0x5596f8cd8742 in /opt/conda/bin/python) 2022-08-17T12:53:23.9133982Z frame #23: _PyObject_Call + 0x20a (0x5596f8c90faa in /opt/conda/bin/python) 2022-08-17T12:53:23.9134569Z frame #24: + 0xa0ab2a (0x7fd3756f7b2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 
2022-08-17T12:53:23.9135364Z frame #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fd3756f5d6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9136379Z frame #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fd3756f8f05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9137491Z frame #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7fd3756fcaa6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9138704Z frame #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7fd36b920e1c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9140011Z frame #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fd3756f8be5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9140913Z frame #30: + 0x453a313 (0x7fd36b91a313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9141849Z frame #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fd36b91af08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9142912Z frame #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fd36b915597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 
2022-08-17T12:53:23.9144287Z frame #33: + 0x456a202 (0x7fd36b94a202 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9144979Z frame #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fd35dc747eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:53:23.9145475Z frame #35: + 0xdbbf4 (0x7fd38cdcabf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-08-17T12:53:23.9146038Z frame #36: + 0x76db (0x7fd3ad41e6db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-08-17T12:53:23.9146551Z frame #37: clone + 0x3f (0x7fd3ad14761f in /lib/x86_64-linux-gnu/libc.so.6) 2022-08-17T12:53:23.9146785Z 2022-08-17T12:53:23.9146804Z 2022-08-17T12:53:23.9250025Z On WorkerInfo(id=3, name=worker3): 2022-08-17T12:53:23.9262454Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f84255be3bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f84255b9d8e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7f842fbccf33 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f842fbce40f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7f842fbcfb42 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: 
at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7f842fd9e75e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x2a72a1e (0x7f8428270a1e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #7: + 0x2a72b26 (0x7f8428270b26 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7f843078eb18 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: + 0x31dfc2a (0x7f8431ef7c2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x31e0399 (0x7f8431ef8399 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7f84307c3ac2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x32b427 (0x7f843c950427 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #13: + 0x32b766 (0x7f843c950766 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #14: + 0x1ddc68 (0x55a2ba044c68 in /opt/conda/bin/python)\nframe #15: + 0x199499 (0x55a2ba000499 in /opt/conda/bin/python)\nframe #16: + 0x1995fa (0x55a2ba0005fa in /opt/conda/bin/python)\nframe #17: PyNumber_Add + 0x41 (0x55a2b9fac4b1 in /opt/conda/bin/python)\nframe #18: _PyEval_EvalFrameDefault + 0x1008 (0x55a2ba049098 in /opt/conda/bin/python)\nframe #19: + 0x18f742 (0x55a2b9ff6742 in /opt/conda/bin/python)\nframe #20: _PyObject_Call + 0x20a (0x55a2b9faefaa in /opt/conda/bin/python)\nframe #21: _PyEval_EvalFrameDefault + 0x26e4 (0x55a2ba04a774 in /opt/conda/bin/python)\nframe #22: + 0x18f742 (0x55a2b9ff6742 in /opt/conda/bin/python)\nframe #23: _PyObject_Call + 0x20a (0x55a2b9faefaa in 
/opt/conda/bin/python)\nframe #24: + 0xa0ab2a (0x7f843d02fb2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f843d02dd6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f843d030f05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7f843d034aa6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7f8433258e1c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f843d030be5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #30: + 0x453a313 (0x7f8433252313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f8433252f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f843324d597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #33: + 0x456a202 (0x7f8433282202 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f84255ac7eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #35: + 0xdbbf4 (0x7f8454702bf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #36: + 0x76db (0x7f8474d566db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #37: clone + 0x3f (0x7f8474a7f61f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-08-17T12:53:23.9270329Z Traceback (most recent call last): 2022-08-17T12:53:23.9270916Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function 2022-08-17T12:53:23.9271471Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-08-17T12:53:23.9272088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5911, in _gpu_add_wrong_gpus 2022-08-17T12:53:23.9272507Z return x.cpu() + y.cuda() 2022-08-17T12:53:23.9272912Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! 
2022-08-17T12:53:23.9273448Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first): 2022-08-17T12:53:23.9274284Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f84255be3bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:53:23.9275345Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f84255b9d8e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:53:23.9276238Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7f842fbccf33 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9277041Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f842fbce40f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9277932Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7f842fbcfb42 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9278815Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7f842fd9e75e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9279549Z frame #6: + 0x2a72a1e (0x7f8428270a1e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:53:23.9280198Z frame #7: + 0x2a72b26 (0x7f8428270b26 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:53:23.9281008Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7f843078eb18 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9281750Z frame #9: + 0x31dfc2a (0x7f8431ef7c2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9282374Z frame #10: + 0x31e0399 (0x7f8431ef8399 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9283132Z frame #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7f84307c3ac2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9283837Z frame #12: + 0x32b427 (0x7f843c950427 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9284471Z frame #13: + 0x32b766 (0x7f843c950766 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9284915Z frame #14: + 0x1ddc68 (0x55a2ba044c68 in /opt/conda/bin/python) 2022-08-17T12:53:23.9285325Z frame #15: + 0x199499 (0x55a2ba000499 in /opt/conda/bin/python) 2022-08-17T12:53:23.9285731Z frame #16: + 0x1995fa (0x55a2ba0005fa in /opt/conda/bin/python) 2022-08-17T12:53:23.9286110Z frame #17: PyNumber_Add + 0x41 (0x55a2b9fac4b1 in /opt/conda/bin/python) 2022-08-17T12:53:23.9286523Z frame #18: _PyEval_EvalFrameDefault + 0x1008 (0x55a2ba049098 in /opt/conda/bin/python) 2022-08-17T12:53:23.9286941Z frame #19: + 0x18f742 (0x55a2b9ff6742 in /opt/conda/bin/python) 2022-08-17T12:53:23.9287398Z frame #20: _PyObject_Call + 0x20a (0x55a2b9faefaa in /opt/conda/bin/python) 2022-08-17T12:53:23.9287808Z frame #21: _PyEval_EvalFrameDefault + 0x26e4 (0x55a2ba04a774 in /opt/conda/bin/python) 2022-08-17T12:53:23.9288219Z frame #22: + 0x18f742 (0x55a2b9ff6742 in /opt/conda/bin/python) 2022-08-17T12:53:23.9288610Z frame #23: _PyObject_Call + 0x20a (0x55a2b9faefaa in /opt/conda/bin/python) 2022-08-17T12:53:23.9289193Z frame #24: + 0xa0ab2a (0x7f843d02fb2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 
2022-08-17T12:53:23.9289986Z frame #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f843d02dd6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9291048Z frame #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f843d030f05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9292150Z frame #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7f843d034aa6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9293351Z frame #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7f8433258e1c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9294615Z frame #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f843d030be5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9295501Z frame #30: + 0x453a313 (0x7f8433252313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9296408Z frame #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f8433252f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9297465Z frame #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f843324d597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 
2022-08-17T12:53:23.9298240Z frame #33: + 0x456a202 (0x7f8433282202 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9298923Z frame #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f84255ac7eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:53:23.9299405Z frame #35: + 0xdbbf4 (0x7f8454702bf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-08-17T12:53:23.9299951Z frame #36: + 0x76db (0x7f8474d566db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-08-17T12:53:23.9300444Z frame #37: clone + 0x3f (0x7f8474a7f61f in /lib/x86_64-linux-gnu/libc.so.6) 2022-08-17T12:53:23.9300669Z 2022-08-17T12:53:23.9300688Z 2022-08-17T12:53:23.9347049Z On WorkerInfo(id=2, name=worker2): 2022-08-17T12:53:23.9359430Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f1a958c93bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f1a958c4d8e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7f1a9fed7f33 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f1a9fed940f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7f1a9fedab42 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: 
at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7f1aa00a975e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x2a72a1e (0x7f1a9857ba1e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #7: + 0x2a72b26 (0x7f1a9857bb26 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7f1aa0a99b18 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: + 0x31dfc2a (0x7f1aa2202c2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x31e0399 (0x7f1aa2203399 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7f1aa0aceac2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x32b427 (0x7f1aacc5b427 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #13: + 0x32b766 (0x7f1aacc5b766 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #14: + 0x1ddc68 (0x55e7aca44c68 in /opt/conda/bin/python)\nframe #15: + 0x199499 (0x55e7aca00499 in /opt/conda/bin/python)\nframe #16: + 0x1995fa (0x55e7aca005fa in /opt/conda/bin/python)\nframe #17: PyNumber_Add + 0x41 (0x55e7ac9ac4b1 in /opt/conda/bin/python)\nframe #18: _PyEval_EvalFrameDefault + 0x1008 (0x55e7aca49098 in /opt/conda/bin/python)\nframe #19: + 0x18f742 (0x55e7ac9f6742 in /opt/conda/bin/python)\nframe #20: _PyObject_Call + 0x20a (0x55e7ac9aefaa in /opt/conda/bin/python)\nframe #21: _PyEval_EvalFrameDefault + 0x26e4 (0x55e7aca4a774 in /opt/conda/bin/python)\nframe #22: + 0x18f742 (0x55e7ac9f6742 in /opt/conda/bin/python)\nframe #23: _PyObject_Call + 0x20a (0x55e7ac9aefaa in 
/opt/conda/bin/python)\nframe #24: + 0xa0ab2a (0x7f1aad33ab2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f1aad338d6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f1aad33bf05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7f1aad33faa6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7f1aa3563e1c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f1aad33bbe5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #30: + 0x453a313 (0x7f1aa355d313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f1aa355df08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f1aa3558597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #33: + 0x456a202 (0x7f1aa358d202 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f1a958b77eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #35: + 0xdbbf4 (0x7f1ac4a0dbf4 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #36: + 0x76db (0x7f1ae50616db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #37: clone + 0x3f (0x7f1ae4d8a61f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-08-17T12:53:23.9366793Z Traceback (most recent call last): 2022-08-17T12:53:23.9367354Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function 2022-08-17T12:53:23.9367822Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-08-17T12:53:23.9368441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5911, in _gpu_add_wrong_gpus 2022-08-17T12:53:23.9368842Z return x.cpu() + y.cuda() 2022-08-17T12:53:23.9369240Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! 
2022-08-17T12:53:23.9369780Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first): 2022-08-17T12:53:23.9370629Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f1a958c93bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:53:23.9371609Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f1a958c4d8e in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:53:23.9372498Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7f1a9fed7f33 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9373300Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f1a9fed940f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9374194Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7f1a9fedab42 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9375089Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7f1aa00a975e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9375805Z frame #6: + 0x2a72a1e (0x7f1a9857ba1e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:53:23.9376517Z frame #7: + 0x2a72b26 (0x7f1a9857bb26 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-08-17T12:53:23.9377352Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7f1aa0a99b18 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9378098Z frame #9: + 0x31dfc2a (0x7f1aa2202c2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9378744Z frame #10: + 0x31e0399 (0x7f1aa2203399 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9379482Z frame #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7f1aa0aceac2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9380255Z frame #12: + 0x32b427 (0x7f1aacc5b427 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9380904Z frame #13: + 0x32b766 (0x7f1aacc5b766 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9381367Z frame #14: + 0x1ddc68 (0x55e7aca44c68 in /opt/conda/bin/python) 2022-08-17T12:53:23.9381753Z frame #15: + 0x199499 (0x55e7aca00499 in /opt/conda/bin/python) 2022-08-17T12:53:23.9382153Z frame #16: + 0x1995fa (0x55e7aca005fa in /opt/conda/bin/python) 2022-08-17T12:53:23.9382550Z frame #17: PyNumber_Add + 0x41 (0x55e7ac9ac4b1 in /opt/conda/bin/python) 2022-08-17T12:53:23.9382945Z frame #18: _PyEval_EvalFrameDefault + 0x1008 (0x55e7aca49098 in /opt/conda/bin/python) 2022-08-17T12:53:23.9383772Z frame #19: + 0x18f742 (0x55e7ac9f6742 in /opt/conda/bin/python) 2022-08-17T12:53:23.9384184Z frame #20: _PyObject_Call + 0x20a (0x55e7ac9aefaa in /opt/conda/bin/python) 2022-08-17T12:53:23.9384598Z frame #21: _PyEval_EvalFrameDefault + 0x26e4 (0x55e7aca4a774 in /opt/conda/bin/python) 2022-08-17T12:53:23.9384998Z frame #22: + 0x18f742 (0x55e7ac9f6742 in /opt/conda/bin/python) 2022-08-17T12:53:23.9385389Z frame #23: _PyObject_Call + 0x20a (0x55e7ac9aefaa in /opt/conda/bin/python) 2022-08-17T12:53:23.9385998Z frame #24: + 0xa0ab2a (0x7f1aad33ab2a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 
2022-08-17T12:53:23.9386795Z frame #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f1aad338d6d in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9387787Z frame #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f1aad33bf05 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9388904Z frame #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7f1aad33faa6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9390119Z frame #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7f1aa3563e1c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9391384Z frame #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f1aad33bbe5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-08-17T12:53:23.9392267Z frame #30: + 0x453a313 (0x7f1aa355d313 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9393280Z frame #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f1aa355df08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9394341Z frame #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f1aa3558597 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 
2022-08-17T12:53:23.9395117Z frame #33: + 0x456a202 (0x7f1aa358d202 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-08-17T12:53:23.9395796Z frame #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f1a958b77eb in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-08-17T12:53:23.9396389Z frame #35: + 0xdbbf4 (0x7f1ac4a0dbf4 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-08-17T12:53:23.9396930Z frame #36: + 0x76db (0x7f1ae50616db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-08-17T12:53:23.9397437Z frame #37: clone + 0x3f (0x7f1ae4d8a61f in /lib/x86_64-linux-gnu/libc.so.6) 2022-08-17T12:53:23.9397665Z 2022-08-17T12:53:23.9397686Z 2022-08-17T12:53:24.6005933Z ok (7.345s) 2022-08-17T12:53:24.6006229Z 2022-08-17T12:53:24.6006631Z ---------------------------------------------------------------------- 2022-08-17T12:53:24.6007175Z Ran 1 test in 7.345s 2022-08-17T12:53:24.6007349Z 2022-08-17T12:53:24.6007427Z OK 2022-08-17T12:53:24.6007566Z 2022-08-17T12:53:24.6007707Z Generating XML reports... 
2022-08-17T12:53:24.6044420Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125317.xml 2022-08-17T12:53:26.3778634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:26.3779367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:26.3780384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:26.3780867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:26.5507930Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4bhg45l9 2022-08-17T12:53:26.5510869Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4bhg45l9/_remote_module_non_scriptable.py 2022-08-17T12:53:26.9790659Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:53:26.9806107Z 2022-08-17T12:53:26.9806446Z Running tests... 2022-08-17T12:53:26.9806906Z ---------------------------------------------------------------------- 2022-08-17T12:53:28.4930724Z test_devices_option_mismatch (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:53:28.5108621Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23289 2022-08-17T12:53:28.5115287Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23290 2022-08-17T12:53:28.5121328Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23291 2022-08-17T12:53:28.5127536Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23292 2022-08-17T12:53:29.9077263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:29.9078132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:29.9078735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:29.9079197Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:29.9080073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:29.9080578Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:29.9081581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:29.9082071Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:29.9082665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:29.9083125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:29.9085883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:29.9086496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:29.9290873Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:29.9291342Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:29.9294112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:29.9294573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:30.0877641Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwwyskp4b 2022-08-17T12:53:30.0878884Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwwyskp4b/_remote_module_non_scriptable.py 2022-08-17T12:53:30.0885744Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxmrk4kat 2022-08-17T12:53:30.0888309Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxmrk4kat/_remote_module_non_scriptable.py 2022-08-17T12:53:30.0891011Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpplhk4q1i 2022-08-17T12:53:30.0893789Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpplhk4q1i/_remote_module_non_scriptable.py 2022-08-17T12:53:30.1037610Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8illphc1 2022-08-17T12:53:30.1039802Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8illphc1/_remote_module_non_scriptable.py 2022-08-17T12:53:30.5176668Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:53:30.5187520Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:53:30.5211601Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:53:30.5365639Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:53:30.6198865Z fi_getinfo: -61 
2022-08-17T12:53:30.6207459Z fi_getinfo: -61 2022-08-17T12:53:30.6232906Z fi_getinfo: -61 2022-08-17T12:53:30.6387831Z fi_getinfo: -61 2022-08-17T12:53:31.1209309Z ok (4.140s) 2022-08-17T12:53:31.1209513Z 2022-08-17T12:53:31.1209938Z ---------------------------------------------------------------------- 2022-08-17T12:53:31.1210319Z Ran 1 test in 4.140s 2022-08-17T12:53:31.1210467Z 2022-08-17T12:53:31.1210563Z OK 2022-08-17T12:53:31.1210701Z 2022-08-17T12:53:31.1210839Z Generating XML reports... 2022-08-17T12:53:31.1246809Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125326.xml 2022-08-17T12:53:32.9070006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:32.9070564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:32.9071791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:32.9072303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:33.0816911Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9x_gp84i 2022-08-17T12:53:33.0819179Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9x_gp84i/_remote_module_non_scriptable.py 2022-08-17T12:53:33.5032669Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:53:33.5048928Z 2022-08-17T12:53:33.5049469Z Running tests... 2022-08-17T12:53:33.5049944Z ---------------------------------------------------------------------- 2022-08-17T12:53:35.0246737Z test_devices_option_mismatch_reverse (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:53:35.0431060Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23476 2022-08-17T12:53:35.0437116Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23477 2022-08-17T12:53:35.0443489Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23478 2022-08-17T12:53:35.0449809Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23479 2022-08-17T12:53:36.4528097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:36.4528611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:36.4529386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:36.4529851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:36.4580828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:36.4581300Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:36.4584224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:36.4584698Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:36.5286255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:36.5286736Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:36.5288996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:36.5289487Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:36.5584464Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:36.5584925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:36.5588594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:36.5589081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:36.6195656Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzguk0ijb 2022-08-17T12:53:36.6197781Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzguk0ijb/_remote_module_non_scriptable.py 2022-08-17T12:53:36.6248485Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp34p22lcn 2022-08-17T12:53:36.6251282Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp34p22lcn/_remote_module_non_scriptable.py 2022-08-17T12:53:36.6992670Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3fs5fd8x 2022-08-17T12:53:36.6993875Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3fs5fd8x/_remote_module_non_scriptable.py 2022-08-17T12:53:36.7364564Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgr_28p80 2022-08-17T12:53:36.7366743Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgr_28p80/_remote_module_non_scriptable.py 2022-08-17T12:53:37.0395751Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:53:37.0406249Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:53:37.1139054Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:53:37.1420086Z fi_getinfo: -61 2022-08-17T12:53:37.1429668Z fi_getinfo: -61 2022-08-17T12:53:37.1762629Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 1 2022-08-17T12:53:37.2162266Z fi_getinfo: -61 2022-08-17T12:53:37.2786132Z fi_getinfo: -61 2022-08-17T12:53:37.7528484Z ok (4.248s) 2022-08-17T12:53:37.7528786Z 2022-08-17T12:53:37.7529191Z ---------------------------------------------------------------------- 2022-08-17T12:53:37.7529819Z Ran 1 test in 4.248s 2022-08-17T12:53:37.7529986Z 2022-08-17T12:53:37.7530078Z OK 2022-08-17T12:53:37.7530211Z 2022-08-17T12:53:37.7530347Z Generating XML reports... 2022-08-17T12:53:37.7566604Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125333.xml 2022-08-17T12:53:39.4968608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:39.4969102Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:39.4969930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:39.4970400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:39.6721946Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqbi4k_fe 2022-08-17T12:53:39.6723851Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqbi4k_fe/_remote_module_non_scriptable.py 2022-08-17T12:53:40.0954905Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:53:40.0971108Z 2022-08-17T12:53:40.0971354Z Running tests... 2022-08-17T12:53:40.0971784Z ---------------------------------------------------------------------- 2022-08-17T12:53:41.6163838Z test_owner_rref_forward_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:53:41.6347345Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23663 2022-08-17T12:53:41.6353592Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23664 2022-08-17T12:53:41.6359779Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23665 2022-08-17T12:53:41.6366755Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23666 2022-08-17T12:53:43.0946421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:43.0947363Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:43.0948505Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:43.0949418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:43.0950576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:43.0951474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:43.0952644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:43.0953584Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:43.1040722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:43.1041571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:43.1043274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:43.1044158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:43.1279614Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:43.1280549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:43.1282279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:43.1283237Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:43.2702612Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcfw23ntu 2022-08-17T12:53:43.2704388Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcfw23ntu/_remote_module_non_scriptable.py 2022-08-17T12:53:43.2705459Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0w8lis0i 2022-08-17T12:53:43.2708440Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0w8lis0i/_remote_module_non_scriptable.py 2022-08-17T12:53:43.2759887Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqvhxbw64 2022-08-17T12:53:43.2762792Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqvhxbw64/_remote_module_non_scriptable.py 2022-08-17T12:53:43.3029316Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeju2a81m 2022-08-17T12:53:43.3031458Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeju2a81m/_remote_module_non_scriptable.py 2022-08-17T12:53:43.6981463Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:53:43.6986343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:53:43.7044931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:53:43.7371224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:53:43.8010512Z fi_getinfo: -61 
2022-08-17T12:53:50.1569745Z ok (10.059s) 2022-08-17T12:53:50.1569955Z 2022-08-17T12:53:50.1570371Z ---------------------------------------------------------------------- 2022-08-17T12:53:50.1570717Z Ran 1 test in 10.060s 2022-08-17T12:53:50.1570883Z 2022-08-17T12:53:50.1570961Z OK 2022-08-17T12:53:50.1571099Z 2022-08-17T12:53:50.1571237Z Generating XML reports... 2022-08-17T12:53:50.1607670Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125340.xml 2022-08-17T12:53:51.9092650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:51.9093370Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:51.9094364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:51.9094828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:52.0773216Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj4iq9cb2 2022-08-17T12:53:52.0776098Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj4iq9cb2/_remote_module_non_scriptable.py 2022-08-17T12:53:52.4884000Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:53:52.4900036Z 2022-08-17T12:53:52.4900234Z Running tests... 2022-08-17T12:53:52.4900667Z ---------------------------------------------------------------------- 2022-08-17T12:53:53.9706492Z test_owner_rref_forward_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:53:53.9886887Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23878 2022-08-17T12:53:53.9893360Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23879 2022-08-17T12:53:53.9899528Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23880 2022-08-17T12:53:53.9906172Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23881 2022-08-17T12:53:55.4255131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:55.4256074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:55.4257268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:55.4258168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:55.4300364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:55.4301295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:55.4302923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:55.4304340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:55.4555680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:55.4556610Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:55.4558353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:55.4559271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:55.4608263Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:53:55.4609166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:53:55.4611926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:53:55.4612866Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:53:55.5953843Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdbnsd3jb 2022-08-17T12:53:55.5954989Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdbnsd3jb/_remote_module_non_scriptable.py 2022-08-17T12:53:55.6003989Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7r44zrrc 2022-08-17T12:53:55.6006080Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7r44zrrc/_remote_module_non_scriptable.py 2022-08-17T12:53:55.6290803Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbh0jmuy2 2022-08-17T12:53:55.6292761Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbh0jmuy2/_remote_module_non_scriptable.py 2022-08-17T12:53:55.6302583Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplw7trrd8 2022-08-17T12:53:55.6305345Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplw7trrd8/_remote_module_non_scriptable.py 2022-08-17T12:53:56.0219063Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:53:56.0261183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:53:56.0518186Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:53:56.0610547Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:53:56.1241919Z fi_getinfo: -61 
2022-08-17T12:54:03.9141642Z ok (11.424s) 2022-08-17T12:54:03.9141880Z 2022-08-17T12:54:03.9142302Z ---------------------------------------------------------------------- 2022-08-17T12:54:03.9142651Z Ran 1 test in 11.424s 2022-08-17T12:54:03.9142819Z 2022-08-17T12:54:03.9142916Z OK 2022-08-17T12:54:03.9143863Z 2022-08-17T12:54:03.9144048Z Generating XML reports... 2022-08-17T12:54:03.9179439Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125352.xml 2022-08-17T12:54:05.6929739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:05.6930274Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:05.6930862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:05.6931585Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:05.8662364Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvynyjbkh 2022-08-17T12:54:05.8665046Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvynyjbkh/_remote_module_non_scriptable.py 2022-08-17T12:54:06.2904954Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:54:06.2920602Z 2022-08-17T12:54:06.2921069Z Running tests... 2022-08-17T12:54:06.2921554Z ---------------------------------------------------------------------- 2022-08-17T12:54:07.8089887Z test_owner_rref_forward_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:54:07.8265764Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24094 2022-08-17T12:54:07.8272281Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24095 2022-08-17T12:54:07.8278328Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24096 2022-08-17T12:54:07.8284523Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24097 2022-08-17T12:54:09.2362691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:09.2363208Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:09.2364227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:09.2364756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:09.2384964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:09.2385427Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:09.2387837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:09.2388312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:09.2442176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:09.2442643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:09.2446317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:09.2446779Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:09.2613912Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:09.2614375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:09.2616969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:09.2617436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:09.4082760Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmply3a4zgi 2022-08-17T12:54:09.4083731Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmply3a4zgi/_remote_module_non_scriptable.py 2022-08-17T12:54:09.4104971Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcbuhd48n 2022-08-17T12:54:09.4107984Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcbuhd48n/_remote_module_non_scriptable.py 2022-08-17T12:54:09.4147057Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcp57j2fs 2022-08-17T12:54:09.4150182Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcp57j2fs/_remote_module_non_scriptable.py 2022-08-17T12:54:09.4363722Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpct0v2a1z 2022-08-17T12:54:09.4366243Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpct0v2a1z/_remote_module_non_scriptable.py 2022-08-17T12:54:09.8396602Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:54:09.8406136Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:54:09.8527372Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:54:09.8694985Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:54:09.9417631Z fi_getinfo: -61 
2022-08-17T12:54:18.0520107Z ok (11.760s) 2022-08-17T12:54:18.0520315Z 2022-08-17T12:54:18.0520683Z ---------------------------------------------------------------------- 2022-08-17T12:54:18.0521049Z Ran 1 test in 11.760s 2022-08-17T12:54:18.0521217Z 2022-08-17T12:54:18.0521321Z OK 2022-08-17T12:54:18.0521460Z 2022-08-17T12:54:18.0521595Z Generating XML reports... 2022-08-17T12:54:18.0557399Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125406.xml 2022-08-17T12:54:19.7918389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:19.7919287Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:19.7920356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:19.7920869Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:19.9588178Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyw_693a9 2022-08-17T12:54:19.9590983Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyw_693a9/_remote_module_non_scriptable.py 2022-08-17T12:54:20.3722449Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:54:20.3737741Z 2022-08-17T12:54:20.3737889Z Running tests... 2022-08-17T12:54:20.3738352Z ---------------------------------------------------------------------- 2022-08-17T12:54:21.8512627Z test_owner_rref_forward_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:54:21.8689215Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24310 2022-08-17T12:54:21.8695963Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24311 2022-08-17T12:54:21.8702238Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24312 2022-08-17T12:54:21.8708861Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24313 2022-08-17T12:54:23.2696108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:23.2697084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:23.2698281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:23.2699253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:23.3166582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:23.3167578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:23.3168774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:23.3169705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:23.3187762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:23.3188629Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:23.3190717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:23.3191810Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:23.3543949Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:23.3545168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:23.3546860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:23.3547804Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:23.4390586Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4nsbafol 2022-08-17T12:54:23.4391972Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4nsbafol/_remote_module_non_scriptable.py 2022-08-17T12:54:23.4884389Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxd79mw0h 2022-08-17T12:54:23.4885766Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxd79mw0h/_remote_module_non_scriptable.py 2022-08-17T12:54:23.4899739Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeq373foy 2022-08-17T12:54:23.4901880Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeq373foy/_remote_module_non_scriptable.py 2022-08-17T12:54:23.5281766Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7v_h7gpc 2022-08-17T12:54:23.5283241Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7v_h7gpc/_remote_module_non_scriptable.py 2022-08-17T12:54:23.8572424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:54:23.9136661Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:54:23.9163543Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:54:23.9568357Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:54:24.0162291Z fi_getinfo: -61 
2022-08-17T12:54:30.2908285Z ok (9.917s) 2022-08-17T12:54:30.2908504Z 2022-08-17T12:54:30.2908898Z ---------------------------------------------------------------------- 2022-08-17T12:54:30.2909243Z Ran 1 test in 9.917s 2022-08-17T12:54:30.2909414Z 2022-08-17T12:54:30.2909512Z OK 2022-08-17T12:54:30.2910730Z 2022-08-17T12:54:30.2912177Z Generating XML reports... 2022-08-17T12:54:30.2945965Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125420.xml 2022-08-17T12:54:32.0690171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:32.0691166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:32.0692390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:32.0693311Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:32.2451507Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpayvwx22w 2022-08-17T12:54:32.2454045Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpayvwx22w/_remote_module_non_scriptable.py 2022-08-17T12:54:32.6692435Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:54:32.6710411Z 2022-08-17T12:54:32.6710914Z Running tests... 2022-08-17T12:54:32.6711446Z ---------------------------------------------------------------------- 2022-08-17T12:54:34.2067035Z test_rref_as_arg_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:54:34.2253881Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24525 2022-08-17T12:54:34.2260734Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24526 2022-08-17T12:54:34.2268004Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24527 2022-08-17T12:54:34.2274407Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24528 2022-08-17T12:54:35.6107836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:35.6108337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:35.6109391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:35.6109875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:35.6153815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:35.6154263Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:35.6157055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:35.6157541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:35.6741806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:35.6742290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:35.6744850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:35.6745333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:35.6821231Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:35.6821695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:35.6824910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:35.6825405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:35.7781734Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjkr6ibr_ 2022-08-17T12:54:35.7783134Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjkr6ibr_/_remote_module_non_scriptable.py 2022-08-17T12:54:35.7829678Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4mzrgveh 2022-08-17T12:54:35.7832253Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4mzrgveh/_remote_module_non_scriptable.py 2022-08-17T12:54:35.8492530Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3lua9y34 2022-08-17T12:54:35.8493960Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3lua9y34/_remote_module_non_scriptable.py 2022-08-17T12:54:35.8501847Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3xksyetx 2022-08-17T12:54:35.8504792Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3xksyetx/_remote_module_non_scriptable.py 2022-08-17T12:54:36.2036199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:54:36.2040224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:54:36.2709579Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:54:36.2860803Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:54:36.3057346Z fi_getinfo: -61 
2022-08-17T12:54:36.3061184Z fi_getinfo: -61 2022-08-17T12:54:36.3728905Z fi_getinfo: -61 2022-08-17T12:54:36.3881257Z fi_getinfo: -61 2022-08-17T12:54:50.1654870Z ok (17.494s) 2022-08-17T12:54:50.1655220Z 2022-08-17T12:54:50.1655660Z ---------------------------------------------------------------------- 2022-08-17T12:54:50.1656013Z Ran 1 test in 17.494s 2022-08-17T12:54:50.1656518Z 2022-08-17T12:54:50.1656616Z OK 2022-08-17T12:54:50.1656758Z 2022-08-17T12:54:50.1656898Z Generating XML reports... 2022-08-17T12:54:50.1692759Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125432.xml 2022-08-17T12:54:51.9352325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:51.9352823Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:51.9353918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:51.9354399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:52.1084568Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd9elnllj 2022-08-17T12:54:52.1086871Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd9elnllj/_remote_module_non_scriptable.py 2022-08-17T12:54:52.5385654Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:54:52.5402293Z 2022-08-17T12:54:52.5402729Z Running tests... 2022-08-17T12:54:52.5403422Z ---------------------------------------------------------------------- 2022-08-17T12:54:54.0545468Z test_rref_as_arg_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:54:54.0731767Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24872 2022-08-17T12:54:54.0737804Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24873 2022-08-17T12:54:54.0743984Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24874 2022-08-17T12:54:54.0750744Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24875 2022-08-17T12:54:55.4715685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:55.4716188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:55.4716964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:55.4717442Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:55.4718029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:55.4718459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:55.4720591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:55.4721069Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:55.4927081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:55.4927549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:55.4930245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:55.4931012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:55.5140545Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:54:55.5140991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:54:55.5143939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:54:55.5144658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:54:55.6476451Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfc_or_g4 2022-08-17T12:54:55.6477609Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfc_or_g4/_remote_module_non_scriptable.py 2022-08-17T12:54:55.6488413Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3di7bmle 2022-08-17T12:54:55.6490998Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3di7bmle/_remote_module_non_scriptable.py 2022-08-17T12:54:55.6678188Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4sh25_yu 2022-08-17T12:54:55.6680926Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4sh25_yu/_remote_module_non_scriptable.py 2022-08-17T12:54:55.6830731Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_fd22zn8 2022-08-17T12:54:55.6833176Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_fd22zn8/_remote_module_non_scriptable.py 2022-08-17T12:54:56.0809572Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:54:56.0835019Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:54:56.1078816Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:54:56.1088160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:54:56.1911075Z fi_getinfo: -61 
2022-08-17T12:54:56.1914381Z fi_getinfo: -61 2022-08-17T12:54:56.2098058Z fi_getinfo: -61 2022-08-17T12:54:56.2107311Z fi_getinfo: -61 2022-08-17T12:55:11.9175848Z ok (19.377s) 2022-08-17T12:55:11.9176080Z 2022-08-17T12:55:11.9176499Z ---------------------------------------------------------------------- 2022-08-17T12:55:11.9176846Z Ran 1 test in 19.377s 2022-08-17T12:55:11.9177012Z 2022-08-17T12:55:11.9177106Z OK 2022-08-17T12:55:11.9177223Z 2022-08-17T12:55:11.9177357Z Generating XML reports... 2022-08-17T12:55:11.9216110Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125452.xml 2022-08-17T12:55:13.6884726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:55:13.6885429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:55:13.6886428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:55:13.6886897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:55:13.8631561Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx3zjtrfc 2022-08-17T12:55:13.8637780Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx3zjtrfc/_remote_module_non_scriptable.py 2022-08-17T12:55:14.2848873Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:55:14.2864935Z 2022-08-17T12:55:14.2865254Z Running tests... 2022-08-17T12:55:14.2865688Z ---------------------------------------------------------------------- 2022-08-17T12:55:15.7996364Z test_rref_as_arg_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:55:15.8159079Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/81962 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.529s) 2022-08-17T12:55:15.8159722Z 2022-08-17T12:55:15.8160013Z ---------------------------------------------------------------------- 2022-08-17T12:55:15.8160330Z Ran 1 test in 1.529s 2022-08-17T12:55:15.8160495Z 2022-08-17T12:55:15.8160606Z OK (skipped=1) 2022-08-17T12:55:15.8160763Z 2022-08-17T12:55:15.8160891Z Generating XML reports... 2022-08-17T12:55:15.8194502Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125514.xml 2022-08-17T12:55:17.5816311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:55:17.5816908Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:55:17.5817546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:55:17.5818019Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:55:17.7563040Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7ffh67p0 2022-08-17T12:55:17.7565181Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7ffh67p0/_remote_module_non_scriptable.py 2022-08-17T12:55:18.1850555Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:55:18.1867494Z 2022-08-17T12:55:18.1867878Z Running tests... 
2022-08-17T12:55:18.1868349Z ---------------------------------------------------------------------- 2022-08-17T12:55:19.6900048Z test_rref_as_arg_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:55:19.7079558Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25259 2022-08-17T12:55:19.7085787Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25260 2022-08-17T12:55:19.7092036Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25261 2022-08-17T12:55:19.7098134Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25262 2022-08-17T12:55:21.0845143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:55:21.0845646Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:55:21.0846466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:55:21.0846963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:55:21.1016619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:55:21.1017073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:55:21.1019941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:55:21.1020419Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:55:21.1147324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:55:21.1147762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:55:21.1150502Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:55:21.1150972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:55:21.1323766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:55:21.1324501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:55:21.1326952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:55:21.1327426Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:55:21.2531762Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4pujl20h 2022-08-17T12:55:21.2532731Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4pujl20h/_remote_module_non_scriptable.py 2022-08-17T12:55:21.2692090Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1ph61pq6 2022-08-17T12:55:21.2694784Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1ph61pq6/_remote_module_non_scriptable.py 2022-08-17T12:55:21.2836066Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgv0i1kjv 2022-08-17T12:55:21.2838981Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgv0i1kjv/_remote_module_non_scriptable.py 2022-08-17T12:55:21.3055045Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_1bkf6zf 2022-08-17T12:55:21.3056986Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_1bkf6zf/_remote_module_non_scriptable.py 2022-08-17T12:55:21.6804564Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:55:21.6897191Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:55:21.7072505Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:55:21.7343194Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:55:21.7898589Z fi_getinfo: -61 2022-08-17T12:55:21.7916190Z fi_getinfo: -61 2022-08-17T12:55:21.8089638Z fi_getinfo: -61 2022-08-17T12:55:21.8361212Z fi_getinfo: -61 2022-08-17T12:55:37.9602649Z ok (19.773s) 2022-08-17T12:55:37.9602860Z 2022-08-17T12:55:37.9605873Z ---------------------------------------------------------------------- 2022-08-17T12:55:37.9606274Z Ran 1 test in 19.773s 2022-08-17T12:55:37.9606424Z 2022-08-17T12:55:37.9606521Z OK 2022-08-17T12:55:37.9606660Z 2022-08-17T12:55:37.9606799Z Generating XML reports... 2022-08-17T12:55:37.9640306Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125518.xml 2022-08-17T12:55:39.7283565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:55:39.7284053Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:55:39.7285000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:55:39.7285498Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:55:39.9027167Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprcd_lctu 2022-08-17T12:55:39.9029775Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprcd_lctu/_remote_module_non_scriptable.py 2022-08-17T12:55:40.3282585Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:55:40.3298402Z 2022-08-17T12:55:40.3298554Z Running tests... 
2022-08-17T12:55:40.3298997Z ---------------------------------------------------------------------- 2022-08-17T12:55:41.8345512Z test_rref_as_arg_synchronization5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:55:41.8524660Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25612 2022-08-17T12:55:41.8531366Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25613 2022-08-17T12:55:41.8537518Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25614 2022-08-17T12:55:41.8544349Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25615 2022-08-17T12:55:43.2560924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:55:43.2561580Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:55:43.2562273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:55:43.2562732Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:55:43.2563508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:55:43.2564146Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:55:43.2566231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:55:43.2566706Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:55:43.2575355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:55:43.2575922Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:55:43.2578871Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:55:43.2579367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:55:43.2762480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:55:43.2762927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:55:43.2765920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:55:43.2766412Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:55:43.4380418Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfgk6ba6w 2022-08-17T12:55:43.4382133Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfgk6ba6w/_remote_module_non_scriptable.py 2022-08-17T12:55:43.4382902Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5dnph9l5 2022-08-17T12:55:43.4384333Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsfuvyito 2022-08-17T12:55:43.4385109Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5dnph9l5/_remote_module_non_scriptable.py 2022-08-17T12:55:43.4387118Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsfuvyito/_remote_module_non_scriptable.py 2022-08-17T12:55:43.4500603Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4sdnbe0r 2022-08-17T12:55:43.4503720Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4sdnbe0r/_remote_module_non_scriptable.py 2022-08-17T12:55:43.8693201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:55:43.8714149Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:55:43.8725710Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:55:43.8878688Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:55:43.9729114Z fi_getinfo: -61 2022-08-17T12:55:43.9732784Z fi_getinfo: -61 2022-08-17T12:55:43.9743902Z fi_getinfo: -61 2022-08-17T12:55:43.9897165Z fi_getinfo: -61 2022-08-17T12:55:57.4912647Z ok (17.161s) 2022-08-17T12:55:57.4913042Z 2022-08-17T12:55:57.4913663Z ---------------------------------------------------------------------- 2022-08-17T12:55:57.4914310Z Ran 1 test in 17.161s 2022-08-17T12:55:57.4914570Z 2022-08-17T12:55:57.4914732Z OK 2022-08-17T12:55:57.4915086Z 2022-08-17T12:55:57.4915341Z Generating XML reports... 2022-08-17T12:55:57.4951805Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125540.xml 2022-08-17T12:55:59.2723264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:55:59.2723770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:55:59.2724697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:55:59.2725181Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:55:59.4477493Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpspx_c1p8 2022-08-17T12:55:59.4480046Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpspx_c1p8/_remote_module_non_scriptable.py 2022-08-17T12:55:59.8766486Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:55:59.8783382Z 2022-08-17T12:55:59.8783810Z Running tests... 
2022-08-17T12:55:59.8784283Z ---------------------------------------------------------------------- 2022-08-17T12:56:01.3896230Z test_rref_forward_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:56:01.4081975Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25959 2022-08-17T12:56:01.4088427Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25960 2022-08-17T12:56:01.4094979Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25961 2022-08-17T12:56:01.4101570Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25962 2022-08-17T12:56:02.8012698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:02.8013240Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:02.8014242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:02.8014758Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:02.8093855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:02.8094333Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:02.8097257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:02.8097737Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:02.8545110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:02.8545597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:02.8547939Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:02.8548420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:02.9084145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:02.9084796Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:02.9085682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:02.9086290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:02.9708224Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuqm1fs_9 2022-08-17T12:56:02.9709662Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuqm1fs_9/_remote_module_non_scriptable.py 2022-08-17T12:56:02.9802199Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmurjefm5 2022-08-17T12:56:02.9805248Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmurjefm5/_remote_module_non_scriptable.py 2022-08-17T12:56:03.0246332Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5om4g_u5 2022-08-17T12:56:03.0247289Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5om4g_u5/_remote_module_non_scriptable.py 2022-08-17T12:56:03.0844845Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp99iv62ix 2022-08-17T12:56:03.0845711Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp99iv62ix/_remote_module_non_scriptable.py 2022-08-17T12:56:03.3921903Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:56:03.4000662Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:56:03.4358509Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:56:03.4942342Z fi_getinfo: -61 2022-08-17T12:56:03.5019935Z fi_getinfo: -61 2022-08-17T12:56:03.5068987Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:56:03.5378312Z fi_getinfo: -61 2022-08-17T12:56:03.6089429Z fi_getinfo: -61 2022-08-17T12:56:15.7436138Z ok (15.865s) 2022-08-17T12:56:15.7436340Z 2022-08-17T12:56:15.7436743Z ---------------------------------------------------------------------- 2022-08-17T12:56:15.7437105Z Ran 1 test in 15.865s 2022-08-17T12:56:15.7437271Z 2022-08-17T12:56:15.7437369Z OK 2022-08-17T12:56:15.7437505Z 2022-08-17T12:56:15.7437623Z Generating XML reports... 2022-08-17T12:56:15.7474531Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125559.xml 2022-08-17T12:56:17.5297766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:17.5298260Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:17.5299425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:17.5299899Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:17.7036704Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp208uzs9c 2022-08-17T12:56:17.7039249Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp208uzs9c/_remote_module_non_scriptable.py 2022-08-17T12:56:18.1300503Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:56:18.1316839Z 2022-08-17T12:56:18.1317092Z Running tests... 
2022-08-17T12:56:18.1317705Z ---------------------------------------------------------------------- 2022-08-17T12:56:19.6444294Z test_rref_forward_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:56:19.6623527Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26305 2022-08-17T12:56:19.6630730Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26306 2022-08-17T12:56:19.6636917Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26307 2022-08-17T12:56:19.6643511Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26308 2022-08-17T12:56:21.0690422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:21.0690939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:21.0691944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:21.0692469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:21.0693070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:21.0693871Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:21.0694537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:21.0695021Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:21.0898275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:21.0898755Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:21.0901614Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:21.0902325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:21.1271942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:21.1272433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:21.1274777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:21.1275244Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:21.2431326Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0ikdznew 2022-08-17T12:56:21.2432514Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0ikdznew/_remote_module_non_scriptable.py 2022-08-17T12:56:21.2438650Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbwapwtvx 2022-08-17T12:56:21.2441374Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbwapwtvx/_remote_module_non_scriptable.py 2022-08-17T12:56:21.2646439Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_boubtkq 2022-08-17T12:56:21.2649166Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_boubtkq/_remote_module_non_scriptable.py 2022-08-17T12:56:21.2984872Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsh60fi4n 2022-08-17T12:56:21.2986991Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsh60fi4n/_remote_module_non_scriptable.py 2022-08-17T12:56:21.6696051Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:56:21.6711513Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:56:21.6989708Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:56:21.7180560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:56:21.7717554Z fi_getinfo: -61 2022-08-17T12:56:21.7731186Z fi_getinfo: -61 2022-08-17T12:56:21.8009095Z fi_getinfo: -61 2022-08-17T12:56:21.8198263Z fi_getinfo: -61 2022-08-17T12:56:34.2987243Z ok (16.167s) 2022-08-17T12:56:34.2987549Z 2022-08-17T12:56:34.2987987Z ---------------------------------------------------------------------- 2022-08-17T12:56:34.2988332Z Ran 1 test in 16.167s 2022-08-17T12:56:34.2988496Z 2022-08-17T12:56:34.2988572Z OK 2022-08-17T12:56:34.2988709Z 2022-08-17T12:56:34.2988850Z Generating XML reports... 2022-08-17T12:56:34.3025569Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125618.xml 2022-08-17T12:56:36.0692104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:36.0692596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:36.0693372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:36.0693884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:36.2418983Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprgy44i3b 2022-08-17T12:56:36.2421243Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprgy44i3b/_remote_module_non_scriptable.py 2022-08-17T12:56:36.6706758Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:56:36.6723057Z 2022-08-17T12:56:36.6723513Z Running tests... 
2022-08-17T12:56:36.6724014Z ---------------------------------------------------------------------- 2022-08-17T12:56:38.1828669Z test_rref_forward_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:56:38.2005365Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26654 2022-08-17T12:56:38.2012057Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26655 2022-08-17T12:56:38.2018744Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26656 2022-08-17T12:56:38.2025578Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26657 2022-08-17T12:56:39.6056796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:39.6057316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:39.6057913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:39.6058378Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:39.6185605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:39.6186076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:39.6189089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:39.6189560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:39.6557604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:39.6558071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:39.6560759Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:39.6561219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:39.6591551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:39.6592007Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:39.6594841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:39.6595297Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:39.7743280Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppnn93ftl 2022-08-17T12:56:39.7744390Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppnn93ftl/_remote_module_non_scriptable.py 2022-08-17T12:56:39.7911596Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2fn2f4bn 2022-08-17T12:56:39.7914197Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2fn2f4bn/_remote_module_non_scriptable.py 2022-08-17T12:56:39.8242047Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_htjyvwa 2022-08-17T12:56:39.8243954Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_htjyvwa/_remote_module_non_scriptable.py 2022-08-17T12:56:39.8281605Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplz7pdmdr 2022-08-17T12:56:39.8284409Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplz7pdmdr/_remote_module_non_scriptable.py 2022-08-17T12:56:40.2027028Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:56:40.2352722Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:56:40.2499360Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:56:40.2533426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:56:40.3151954Z fi_getinfo: -61 2022-08-17T12:56:40.3372859Z fi_getinfo: -61 2022-08-17T12:56:40.3519966Z fi_getinfo: -61 2022-08-17T12:56:40.3554440Z fi_getinfo: -61 2022-08-17T12:56:52.8373310Z ok (16.165s) 2022-08-17T12:56:52.8373680Z 2022-08-17T12:56:52.8374100Z ---------------------------------------------------------------------- 2022-08-17T12:56:52.8374827Z Ran 1 test in 16.165s 2022-08-17T12:56:52.8375006Z 2022-08-17T12:56:52.8375103Z OK 2022-08-17T12:56:52.8375227Z 2022-08-17T12:56:52.8375367Z Generating XML reports... 2022-08-17T12:56:52.8410800Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125636.xml 2022-08-17T12:56:54.6264939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:54.6265446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:54.6266508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:54.6267002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:54.7997898Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf99it2ub 2022-08-17T12:56:54.8000165Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf99it2ub/_remote_module_non_scriptable.py 2022-08-17T12:56:55.2248991Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:56:55.2264942Z 2022-08-17T12:56:55.2265376Z Running tests... 
2022-08-17T12:56:55.2265856Z ---------------------------------------------------------------------- 2022-08-17T12:56:56.7448874Z test_rref_forward_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:56:56.7624819Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27003 2022-08-17T12:56:56.7631441Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27004 2022-08-17T12:56:56.7637792Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27005 2022-08-17T12:56:56.7643695Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27006 2022-08-17T12:56:58.1538881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:58.1539388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:58.1540194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:58.1540680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:58.1799272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:58.1799731Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:58.1802203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:58.1802680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:58.1985207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:58.1985692Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:58.1988510Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:58.1989037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:58.2505694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:56:58.2506145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:56:58.2508222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:56:58.2508693Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:56:58.3209306Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7fo7n6am 2022-08-17T12:56:58.3210843Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7fo7n6am/_remote_module_non_scriptable.py 2022-08-17T12:56:58.3538837Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjgamindc 2022-08-17T12:56:58.3541440Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjgamindc/_remote_module_non_scriptable.py 2022-08-17T12:56:58.3661720Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa1ajfpxg 2022-08-17T12:56:58.3664395Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa1ajfpxg/_remote_module_non_scriptable.py 2022-08-17T12:56:58.4195791Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr5sba6n1 2022-08-17T12:56:58.4196609Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr5sba6n1/_remote_module_non_scriptable.py 2022-08-17T12:56:58.7410862Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:56:58.7848730Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:56:58.7886742Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:56:58.8338400Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:56:58.8432255Z fi_getinfo: -61 2022-08-17T12:56:58.8870698Z fi_getinfo: -61 2022-08-17T12:56:58.8906282Z fi_getinfo: -61 2022-08-17T12:56:58.9358534Z fi_getinfo: -61 2022-08-17T12:57:11.2984976Z ok (16.072s) 2022-08-17T12:57:11.2985171Z 2022-08-17T12:57:11.2985586Z ---------------------------------------------------------------------- 2022-08-17T12:57:11.2985931Z Ran 1 test in 16.072s 2022-08-17T12:57:11.2986097Z 2022-08-17T12:57:11.2986191Z OK 2022-08-17T12:57:11.2986325Z 2022-08-17T12:57:11.2986444Z Generating XML reports... 2022-08-17T12:57:11.3022037Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125655.xml 2022-08-17T12:57:13.0551200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:13.0552001Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:13.0552836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:13.0553319Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:13.2284410Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp64rttnio 2022-08-17T12:57:13.2286633Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp64rttnio/_remote_module_non_scriptable.py 2022-08-17T12:57:13.6520431Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:57:13.6536777Z 2022-08-17T12:57:13.6536925Z Running tests... 
2022-08-17T12:57:13.6537375Z ---------------------------------------------------------------------- 2022-08-17T12:57:15.1459125Z test_rref_to_here_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:57:15.1643159Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27349 2022-08-17T12:57:15.1649377Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27350 2022-08-17T12:57:15.1655502Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27351 2022-08-17T12:57:15.1661947Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27352 2022-08-17T12:57:16.5697458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:16.5697984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:16.5698762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:16.5699562Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:16.5867409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:16.5867869Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:16.5870792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:16.5871249Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:16.5896326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:16.5896780Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:16.5899998Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:16.5900464Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:16.6052044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:16.6052496Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:16.6055525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:16.6055981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:16.7382631Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpprdfhsbd 2022-08-17T12:57:16.7384544Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpprdfhsbd/_remote_module_non_scriptable.py 2022-08-17T12:57:16.7598197Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpviztmhtc 2022-08-17T12:57:16.7600866Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpviztmhtc/_remote_module_non_scriptable.py 2022-08-17T12:57:16.7650655Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp67k216yk 2022-08-17T12:57:16.7653774Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp67k216yk/_remote_module_non_scriptable.py 2022-08-17T12:57:16.7764076Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpps2mz7ty 2022-08-17T12:57:16.7766759Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpps2mz7ty/_remote_module_non_scriptable.py 2022-08-17T12:57:17.1670905Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:57:17.1900932Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:57:17.2008047Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:57:17.2057110Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:57:17.2692428Z fi_getinfo: -61 2022-08-17T12:57:17.2918564Z fi_getinfo: -61 2022-08-17T12:57:17.3028707Z fi_getinfo: -61 2022-08-17T12:57:17.3076247Z fi_getinfo: -61 2022-08-17T12:57:31.0038301Z ok (17.350s) 2022-08-17T12:57:31.0038558Z 2022-08-17T12:57:31.0038950Z ---------------------------------------------------------------------- 2022-08-17T12:57:31.0039284Z Ran 1 test in 17.350s 2022-08-17T12:57:31.0039453Z 2022-08-17T12:57:31.0039548Z OK 2022-08-17T12:57:31.0039682Z 2022-08-17T12:57:31.0039814Z Generating XML reports... 2022-08-17T12:57:31.0075142Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125713.xml 2022-08-17T12:57:32.7496507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:32.7497022Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:32.7498224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:32.7498683Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:32.9252263Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyxg44koc 2022-08-17T12:57:32.9254452Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyxg44koc/_remote_module_non_scriptable.py 2022-08-17T12:57:33.3490505Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:57:33.3507811Z 2022-08-17T12:57:33.3508073Z Running tests... 
2022-08-17T12:57:33.3508486Z ---------------------------------------------------------------------- 2022-08-17T12:57:34.8608216Z test_rref_to_here_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:57:34.8795002Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27696 2022-08-17T12:57:34.8801836Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27697 2022-08-17T12:57:34.8809069Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27698 2022-08-17T12:57:34.8815714Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27699 2022-08-17T12:57:36.2752134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:36.2753124Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:36.2754280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:36.2755213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:36.2794169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:36.2794663Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:36.2795865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:36.2796329Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:36.2797315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:36.2797833Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:36.2799450Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:36.2799924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:36.3033052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:36.3033521Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:36.3036186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:36.3036966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:36.4437059Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp09yyfgnx 2022-08-17T12:57:36.4439005Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp09yyfgnx/_remote_module_non_scriptable.py 2022-08-17T12:57:36.4534621Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxfy27qqp 2022-08-17T12:57:36.4537430Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxfy27qqp/_remote_module_non_scriptable.py 2022-08-17T12:57:36.4546473Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptczr_ku5 2022-08-17T12:57:36.4549622Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptczr_ku5/_remote_module_non_scriptable.py 2022-08-17T12:57:36.4785415Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprwfp6wrq 2022-08-17T12:57:36.4788302Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprwfp6wrq/_remote_module_non_scriptable.py 2022-08-17T12:57:36.8716015Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:57:36.8801381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:57:36.8835842Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:57:36.9131411Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:57:36.9736191Z fi_getinfo: -61 2022-08-17T12:57:36.9823675Z fi_getinfo: -61 2022-08-17T12:57:36.9853765Z fi_getinfo: -61 2022-08-17T12:57:37.0153423Z fi_getinfo: -61 2022-08-17T12:57:52.7242809Z ok (19.373s) 2022-08-17T12:57:52.7243141Z 2022-08-17T12:57:52.7243570Z ---------------------------------------------------------------------- 2022-08-17T12:57:52.7243896Z Ran 1 test in 19.373s 2022-08-17T12:57:52.7244616Z 2022-08-17T12:57:52.7244894Z OK 2022-08-17T12:57:52.7245138Z 2022-08-17T12:57:52.7245379Z Generating XML reports... 2022-08-17T12:57:52.7285019Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125733.xml 2022-08-17T12:57:54.5043359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:54.5043895Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:54.5044823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:54.5045306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:54.6776686Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp95dynnl2 2022-08-17T12:57:54.6779272Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp95dynnl2/_remote_module_non_scriptable.py 2022-08-17T12:57:55.1039079Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:57:55.1055312Z 2022-08-17T12:57:55.1055792Z Running tests... 
2022-08-17T12:57:55.1056270Z ---------------------------------------------------------------------- 2022-08-17T12:57:56.6097405Z test_rref_to_here_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:57:56.6282086Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28049 2022-08-17T12:57:56.6288194Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28050 2022-08-17T12:57:56.6294539Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28051 2022-08-17T12:57:56.6300722Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28052 2022-08-17T12:57:58.0180290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:58.0181661Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:58.0182857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:58.0184099Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:58.0352708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:58.0353620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:58.0355473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:58.0356387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:58.0627679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:58.0628569Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:58.0631698Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:58.0632661Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:58.0782581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:57:58.0783872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:57:58.0785387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:57:58.0786349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:57:58.1861300Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk3i8jshb 2022-08-17T12:57:58.1862481Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk3i8jshb/_remote_module_non_scriptable.py 2022-08-17T12:57:58.2075808Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpic0t9_6v 2022-08-17T12:57:58.2078167Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpic0t9_6v/_remote_module_non_scriptable.py 2022-08-17T12:57:58.2313841Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0dp92fbl 2022-08-17T12:57:58.2316248Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0dp92fbl/_remote_module_non_scriptable.py 2022-08-17T12:57:58.2549053Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw376sbe0 2022-08-17T12:57:58.2550965Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw376sbe0/_remote_module_non_scriptable.py 2022-08-17T12:57:58.6070491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:57:58.6378557Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:57:58.6522655Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:57:58.6816138Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:57:58.7091620Z fi_getinfo: -61 2022-08-17T12:57:58.7398735Z fi_getinfo: -61 2022-08-17T12:57:58.7539615Z fi_getinfo: -61 2022-08-17T12:57:58.7835743Z fi_getinfo: -61 2022-08-17T12:58:12.4681541Z ok (17.362s) 2022-08-17T12:58:12.4681763Z 2022-08-17T12:58:12.4682159Z ---------------------------------------------------------------------- 2022-08-17T12:58:12.4682504Z Ran 1 test in 17.362s 2022-08-17T12:58:12.4682668Z 2022-08-17T12:58:12.4682747Z OK 2022-08-17T12:58:12.4682886Z 2022-08-17T12:58:12.4683022Z Generating XML reports... 2022-08-17T12:58:12.4719283Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125755.xml 2022-08-17T12:58:14.2074574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:14.2075682Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:14.2076430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:14.2076892Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:14.3806625Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnaxprzia 2022-08-17T12:58:14.3809147Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnaxprzia/_remote_module_non_scriptable.py 2022-08-17T12:58:14.8056708Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:58:14.8072199Z 2022-08-17T12:58:14.8072608Z Running tests... 
2022-08-17T12:58:14.8073528Z ---------------------------------------------------------------------- 2022-08-17T12:58:16.3132380Z test_rref_to_here_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:58:16.3316127Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28396 2022-08-17T12:58:16.3322670Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28397 2022-08-17T12:58:16.3329063Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28398 2022-08-17T12:58:16.3335384Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28399 2022-08-17T12:58:17.7285885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:17.7286394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:17.7287145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:17.7287626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:17.7288230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:17.7288683Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:17.7290347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:17.7290807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:17.7345975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:17.7346437Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:17.7349452Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:17.7349918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:17.7459492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:17.7459951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:17.7462872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:17.7463332Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:17.9073680Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgmlrvleq 2022-08-17T12:58:17.9074251Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7mww8ppq 2022-08-17T12:58:17.9075132Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgmlrvleq/_remote_module_non_scriptable.py 2022-08-17T12:58:17.9075704Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7mww8ppq/_remote_module_non_scriptable.py 2022-08-17T12:58:17.9107847Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiewdnccu 2022-08-17T12:58:17.9111050Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiewdnccu/_remote_module_non_scriptable.py 2022-08-17T12:58:17.9190554Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpypys5z77 2022-08-17T12:58:17.9193590Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpypys5z77/_remote_module_non_scriptable.py 2022-08-17T12:58:18.3376355Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:58:18.3393823Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:58:18.3433776Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:58:18.3520810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:58:18.4397669Z fi_getinfo: -61 2022-08-17T12:58:18.4413680Z fi_getinfo: -61 2022-08-17T12:58:18.4452002Z fi_getinfo: -61 2022-08-17T12:58:18.4540811Z fi_getinfo: -61 2022-08-17T12:58:34.5765930Z ok (19.769s) 2022-08-17T12:58:34.5766297Z 2022-08-17T12:58:34.5766726Z ---------------------------------------------------------------------- 2022-08-17T12:58:34.5767073Z Ran 1 test in 19.769s 2022-08-17T12:58:34.5767242Z 2022-08-17T12:58:34.5770830Z OK 2022-08-17T12:58:34.5771064Z 2022-08-17T12:58:34.5771631Z Generating XML reports... 2022-08-17T12:58:34.5804117Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125814.xml 2022-08-17T12:58:36.3333489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:36.3334036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:36.3335354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:36.3335848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:36.5066206Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_6pg8pcn 2022-08-17T12:58:36.5069581Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_6pg8pcn/_remote_module_non_scriptable.py 2022-08-17T12:58:36.9287297Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:58:36.9303476Z 2022-08-17T12:58:36.9303620Z Running tests... 
2022-08-17T12:58:36.9305022Z ---------------------------------------------------------------------- 2022-08-17T12:58:38.4529924Z test_rref_with_unpickleable_attributes (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:58:38.4715594Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28749 2022-08-17T12:58:38.4722255Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28750 2022-08-17T12:58:38.4729170Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28751 2022-08-17T12:58:38.4735729Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28752 2022-08-17T12:58:39.8543484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:39.8544791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:39.8545404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:39.8545864Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:39.8705849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:39.8706332Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:39.8709614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:39.8710092Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:39.8891030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:39.8891483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:39.8895040Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:39.8895500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:39.9206693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:39.9207379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:39.9209758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:39.9210220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:40.0211706Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprhitdl47 2022-08-17T12:58:40.0213173Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprhitdl47/_remote_module_non_scriptable.py 2022-08-17T12:58:40.0372004Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9vwlftx1 2022-08-17T12:58:40.0374883Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9vwlftx1/_remote_module_non_scriptable.py 2022-08-17T12:58:40.0637057Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps1h13vsv 2022-08-17T12:58:40.0639978Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps1h13vsv/_remote_module_non_scriptable.py 2022-08-17T12:58:40.0903054Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3hqk_zju 2022-08-17T12:58:40.0904237Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3hqk_zju/_remote_module_non_scriptable.py 2022-08-17T12:58:40.4404647Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:58:40.4594345Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:58:40.5013335Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:58:40.5090623Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:58:40.5423478Z fi_getinfo: -61 2022-08-17T12:58:40.5612645Z fi_getinfo: -61 2022-08-17T12:58:40.6031116Z fi_getinfo: -61 2022-08-17T12:58:40.6108598Z fi_getinfo: -61 2022-08-17T12:58:44.0908284Z ok (7.160s) 2022-08-17T12:58:44.0908707Z 2022-08-17T12:58:44.0909471Z ---------------------------------------------------------------------- 2022-08-17T12:58:44.0910139Z Ran 1 test in 7.160s 2022-08-17T12:58:44.0910293Z 2022-08-17T12:58:44.0910390Z OK 2022-08-17T12:58:44.0910529Z 2022-08-17T12:58:44.0910685Z Generating XML reports... 2022-08-17T12:58:44.0946282Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125836.xml 2022-08-17T12:58:45.8504033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:45.8504539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:45.8506077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:45.8506744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:46.0237145Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe50hao71 2022-08-17T12:58:46.0239319Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe50hao71/_remote_module_non_scriptable.py 2022-08-17T12:58:46.4473063Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:58:46.4487995Z 2022-08-17T12:58:46.4488219Z Running tests... 
2022-08-17T12:58:46.4489053Z ---------------------------------------------------------------------- 2022-08-17T12:58:47.9674187Z test_tensor_view_as_return_value (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:58:47.9858393Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29100 2022-08-17T12:58:47.9864128Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29101 2022-08-17T12:58:47.9870794Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29102 2022-08-17T12:58:47.9877195Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29103 2022-08-17T12:58:49.4055092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:49.4055605Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:49.4056519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:49.4057117Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:49.4261326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:49.4261788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:49.4264955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:49.4265466Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:49.4440099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:49.4440561Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:49.4443613Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:49.4444494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:49.4715379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:49.4716195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:49.4718884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:49.4719639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:49.5817207Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpplw8j08u 2022-08-17T12:58:49.5818673Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpplw8j08u/_remote_module_non_scriptable.py 2022-08-17T12:58:49.5928794Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7oavbiqt 2022-08-17T12:58:49.5931542Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7oavbiqt/_remote_module_non_scriptable.py 2022-08-17T12:58:49.6126847Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps9qq_f9s 2022-08-17T12:58:49.6129668Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps9qq_f9s/_remote_module_non_scriptable.py 2022-08-17T12:58:49.6506744Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkk_lam_f 2022-08-17T12:58:49.6508261Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkk_lam_f/_remote_module_non_scriptable.py 2022-08-17T12:58:50.0215732Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:58:50.0237773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:58:50.0308059Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:58:50.0798395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:58:50.1238153Z fi_getinfo: -61 2022-08-17T12:58:50.1256255Z fi_getinfo: -61 2022-08-17T12:58:50.1326352Z fi_getinfo: -61 2022-08-17T12:58:50.1816450Z fi_getinfo: -61 2022-08-17T12:58:55.6070567Z ok (9.158s) 2022-08-17T12:58:55.6070800Z 2022-08-17T12:58:55.6071195Z ---------------------------------------------------------------------- 2022-08-17T12:58:55.6071518Z Ran 1 test in 9.158s 2022-08-17T12:58:55.6071687Z 2022-08-17T12:58:55.6071781Z OK 2022-08-17T12:58:55.6072979Z 2022-08-17T12:58:55.6073587Z Generating XML reports... 2022-08-17T12:58:55.6108429Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125846.xml 2022-08-17T12:58:57.3584287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:58:57.3584796Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:58:57.3586800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:58:57.3587288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:58:57.5330324Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn4di27uy 2022-08-17T12:58:57.5332926Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn4di27uy/_remote_module_non_scriptable.py 2022-08-17T12:58:57.9625569Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:58:57.9642153Z 2022-08-17T12:58:57.9642294Z Running tests... 
2022-08-17T12:58:57.9643148Z ---------------------------------------------------------------------- 2022-08-17T12:58:59.4630115Z test_device_maps_backward_pass (__main__.TensorPipeTensorPipeCudaDistAutogradTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:58:59.4809067Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29747 2022-08-17T12:58:59.4815256Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29748 2022-08-17T12:58:59.4821423Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29749 2022-08-17T12:58:59.4827964Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29750 2022-08-17T12:59:00.8966418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:00.8966927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:00.8968115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:00.8968578Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:00.8972818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:00.8973275Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:00.8976356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:00.8976814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:00.9034213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:00.9034669Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:00.9037452Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:00.9037913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:00.9061823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:00.9062293Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:00.9065135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:00.9065586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:01.0716757Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzqk2sm4n 2022-08-17T12:59:01.0717553Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzqk2sm4n/_remote_module_non_scriptable.py 2022-08-17T12:59:01.0749001Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvngikimh 2022-08-17T12:59:01.0751692Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvngikimh/_remote_module_non_scriptable.py 2022-08-17T12:59:01.0822386Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpayhuelrm 2022-08-17T12:59:01.0824895Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpayhuelrm/_remote_module_non_scriptable.py 2022-08-17T12:59:01.0862821Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiy4t7gyd 2022-08-17T12:59:01.0866105Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiy4t7gyd/_remote_module_non_scriptable.py 2022-08-17T12:59:01.5028725Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:59:01.5032847Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:59:01.5107771Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:59:01.5237903Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:59:01.8898472Z skip: Need at least 4 CUDA devices (3.925s) 2022-08-17T12:59:01.8898907Z 2022-08-17T12:59:01.8899570Z ---------------------------------------------------------------------- 2022-08-17T12:59:01.8900149Z Ran 1 test in 3.925s 2022-08-17T12:59:01.8900438Z 2022-08-17T12:59:01.8900620Z OK (skipped=1) 2022-08-17T12:59:01.8900895Z 2022-08-17T12:59:01.8901123Z Generating XML reports... 2022-08-17T12:59:01.8939383Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20220817125857.xml 2022-08-17T12:59:03.6904207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:03.6904717Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:03.6905857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:03.6906353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:03.8649518Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppski1xzr 2022-08-17T12:59:03.8651987Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppski1xzr/_remote_module_non_scriptable.py 2022-08-17T12:59:04.2903306Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:59:04.2919502Z 2022-08-17T12:59:04.2919755Z Running tests... 2022-08-17T12:59:04.2920208Z ---------------------------------------------------------------------- 2022-08-17T12:59:05.8089097Z test_dist_autograd_sync_streams (__main__.TensorPipeTensorPipeCudaDistAutogradTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:59:05.8274194Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29918 2022-08-17T12:59:05.8280416Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29919 2022-08-17T12:59:05.8286515Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29920 2022-08-17T12:59:05.8293515Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29921 2022-08-17T12:59:07.2200399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:07.2200914Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:07.2201490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:07.2201969Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:07.2578179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:07.2578657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:07.2581023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:07.2581515Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:07.2600985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:07.2601875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:07.2605084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:07.2605545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:07.2686944Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:07.2687398Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:07.2690742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:07.2691203Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:07.3881248Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptr97_qnu 2022-08-17T12:59:07.3882446Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptr97_qnu/_remote_module_non_scriptable.py 2022-08-17T12:59:07.4291739Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgfhhly2z 2022-08-17T12:59:07.4294176Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgfhhly2z/_remote_module_non_scriptable.py 2022-08-17T12:59:07.4322552Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjley8j5i 2022-08-17T12:59:07.4325501Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjley8j5i/_remote_module_non_scriptable.py 2022-08-17T12:59:07.4354819Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyrzgq12_ 2022-08-17T12:59:07.4357882Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyrzgq12_/_remote_module_non_scriptable.py 2022-08-17T12:59:07.8055437Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:59:07.8587173Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:59:07.8587690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:59:07.8592931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:59:08.2363814Z skip: Need at least 4 CUDA 
devices (3.944s) 2022-08-17T12:59:08.2364322Z 2022-08-17T12:59:08.2365009Z ---------------------------------------------------------------------- 2022-08-17T12:59:08.2365373Z Ran 1 test in 3.944s 2022-08-17T12:59:08.2365536Z 2022-08-17T12:59:08.2365649Z OK (skipped=1) 2022-08-17T12:59:08.2365823Z 2022-08-17T12:59:08.2365934Z Generating XML reports... 2022-08-17T12:59:08.2402523Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20220817125904.xml 2022-08-17T12:59:10.0163822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:10.0164921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:10.0165686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:10.0166160Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:10.1935862Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp533png29 2022-08-17T12:59:10.1938239Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp533png29/_remote_module_non_scriptable.py 2022-08-17T12:59:10.6220696Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-08-17T12:59:10.6236766Z 2022-08-17T12:59:10.6237140Z Running tests... 2022-08-17T12:59:10.6238081Z ---------------------------------------------------------------------- 2022-08-17T12:59:12.1365518Z test_gradients_synchronizations (__main__.TensorPipeTensorPipeCudaDistAutogradTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:59:12.1544445Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30089 2022-08-17T12:59:12.1551307Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30090 2022-08-17T12:59:12.1557274Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30091 2022-08-17T12:59:12.1563460Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30092 2022-08-17T12:59:13.5385491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:13.5386401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:13.5387603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:13.5388096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:13.5495528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:13.5496272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:13.5498721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:13.5499460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:13.5707426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:13.5708389Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:13.5709209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:13.5709685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:13.5751920Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:13.5752659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:13.5755423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:13.5756169Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:13.7071650Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp14jimcrp 2022-08-17T12:59:13.7073002Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp14jimcrp/_remote_module_non_scriptable.py 2022-08-17T12:59:13.7173761Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbgbupqw1 2022-08-17T12:59:13.7176781Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbgbupqw1/_remote_module_non_scriptable.py 2022-08-17T12:59:13.7437023Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp46p5oowy 2022-08-17T12:59:13.7439503Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmuupvujl 2022-08-17T12:59:13.7440515Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp46p5oowy/_remote_module_non_scriptable.py 2022-08-17T12:59:13.7442114Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmuupvujl/_remote_module_non_scriptable.py 2022-08-17T12:59:14.1324759Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T12:59:14.1476414Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T12:59:14.1645814Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:59:14.1755036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:59:14.5635048Z skip: Need at least 4 CUDA 
devices (3.939s) 2022-08-17T12:59:14.5635574Z 2022-08-17T12:59:14.5636232Z ---------------------------------------------------------------------- 2022-08-17T12:59:14.5636570Z Ran 1 test in 3.940s 2022-08-17T12:59:14.5636742Z 2022-08-17T12:59:14.5636857Z OK (skipped=1) 2022-08-17T12:59:14.5637014Z 2022-08-17T12:59:14.5637144Z Generating XML reports... 2022-08-17T12:59:14.5673874Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20220817125910.xml 2022-08-17T12:59:15.1474667Z Running distributed/test_c10d_nccl ... [2022-08-17 12:59:15.146964] 2022-08-17T12:59:15.1475481Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_nccl.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 12:59:15.147043] 2022-08-17T12:59:16.7357439Z , <__main__.CommTest testMethod=test_broadcast_coalesced_nccl>, <__main__.CommTest testMethod=test_nccl_barrier>, <__main__.CommTest testMethod=test_nccl_barrier_device_ids>, <__main__.CommTest testMethod=test_nccl_barrier_device_ids_function_argument>, <__main__.CommTest testMethod=test_nccl_barrier_timeout>, <__main__.CommTest testMethod=test_nccl_barrier_timeout_new_group>, <__main__.CommTest testMethod=test_nccl_barrier_timeout_new_group_non_member>, <__main__.CommTest testMethod=test_nccl_warn_not_in_group_debug_detail>, <__main__.CommTest testMethod=test_nccl_warn_not_in_group_debug_info>, <__main__.CommTest testMethod=test_nccl_warn_not_in_group_debug_off>, <__main__.CommTest testMethod=test_pass_nccl_options_high_priority_stream>, <__main__.CommTest testMethod=test_sequence_num_incremented_nccl_default>, <__main__.CommTest testMethod=test_sequence_num_incremented_nccl_subgroup>, <__main__.CommTest testMethod=test_sequence_num_set_default_pg_nccl>, <__main__.CommTest testMethod=test_sequence_num_set_nccl_new_group>]> 2022-08-17T12:59:16.7360237Z test_all_reduce_coalesced_nccl (__main__.CommTest) 
2022-08-17T12:59:16.7360577Z test_broadcast_coalesced_nccl (__main__.CommTest) 2022-08-17T12:59:16.7360893Z test_nccl_barrier (__main__.CommTest) 2022-08-17T12:59:16.7361215Z test_nccl_barrier_device_ids (__main__.CommTest) 2022-08-17T12:59:16.7361576Z test_nccl_barrier_device_ids_function_argument (__main__.CommTest) 2022-08-17T12:59:16.7361913Z test_nccl_barrier_timeout (__main__.CommTest) 2022-08-17T12:59:16.7362253Z test_nccl_barrier_timeout_new_group (__main__.CommTest) 2022-08-17T12:59:16.7362624Z test_nccl_barrier_timeout_new_group_non_member (__main__.CommTest) 2022-08-17T12:59:16.7362985Z test_nccl_warn_not_in_group_debug_detail (__main__.CommTest) 2022-08-17T12:59:16.7363356Z test_nccl_warn_not_in_group_debug_info (__main__.CommTest) 2022-08-17T12:59:16.7363708Z test_nccl_warn_not_in_group_debug_off (__main__.CommTest) 2022-08-17T12:59:16.7364054Z test_pass_nccl_options_high_priority_stream (__main__.CommTest) 2022-08-17T12:59:16.7364643Z test_sequence_num_incremented_nccl_default (__main__.CommTest) 2022-08-17T12:59:16.7365044Z test_sequence_num_incremented_nccl_subgroup (__main__.CommTest) 2022-08-17T12:59:16.7365420Z test_sequence_num_set_default_pg_nccl (__main__.CommTest) 2022-08-17T12:59:16.7365754Z test_sequence_num_set_nccl_new_group (__main__.CommTest) 2022-08-17T12:59:16.7374835Z , <__main__.DistributedDataParallelTest testMethod=test_accumulate_gradients_module_with_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_arbitrary_forward_return_value>, <__main__.DistributedDataParallelTest testMethod=test_arbitrary_forward_return_value_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_bf16_compress_wrapper_is_view>, <__main__.DistributedDataParallelTest testMethod=test_bf16_compress_wrapper_nccl>, <__main__.DistributedDataParallelTest testMethod=test_builtin_ddp_comm_hooks_nccl>, <__main__.DistributedDataParallelTest testMethod=test_builtin_ddp_comm_hooks_nccl_grad_is_view>, <__main__.DistributedDataParallelTest 
testMethod=test_channels_last_contig>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_dynamic_module>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_dynamic_weight_sharing>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_once_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_once_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_static_graph_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_static_graph_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_weight_sharing>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_unused_params_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_unused_params_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_weight_sharing_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_weight_sharing_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_allreduce_hook_nccl>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_allreduce_hook_nccl_static_graph>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_allreduce_with_then_hook_nccl>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_future_passing_gpu_nccl>, <__main__.DistributedDataParallelTest testMethod=test_ddp_multi_device_module_config>, <__main__.DistributedDataParallelTest 
testMethod=test_ddp_weight_sharing>, <__main__.DistributedDataParallelTest testMethod=test_ddp_with_lazy_parameters>, <__main__.DistributedDataParallelTest testMethod=test_default_ddp_comm_hooks_nccl>, <__main__.DistributedDataParallelTest testMethod=test_default_ddp_comm_hooks_nccl_is_view>, <__main__.DistributedDataParallelTest testMethod=test_failure_recovery>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_kwarg_debug_detail>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_kwarg_debug_info>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_kwarg_debug_off>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_kwarg_grad_is_view_debug_detail>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_kwarg_grad_is_view_debug_info>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_kwarg_grad_is_view_debug_off>, <__main__.DistributedDataParallelTest testMethod=test_fp16>, <__main__.DistributedDataParallelTest testMethod=test_fp16_compress_wrapper_is_view>, <__main__.DistributedDataParallelTest testMethod=test_fp16_compress_wrapper_nccl>, <__main__.DistributedDataParallelTest testMethod=test_fp16_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_grad_layout_1devicemodule_1replicaperprocess>, <__main__.DistributedDataParallelTest testMethod=test_grad_layout_2devicemodule>, <__main__.DistributedDataParallelTest testMethod=test_invalid_powerSGD_state>, <__main__.DistributedDataParallelTest testMethod=test_multiple_outputs_multiple_backward>, <__main__.DistributedDataParallelTest testMethod=test_multiple_outputs_multiple_backward_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_1gpu_module_device_ids_integer_list>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_1gpu_module_device_ids_torch_device_list>, <__main__.DistributedDataParallelTest 
testMethod=test_nccl_backend_2gpu_module>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_4gpu_module>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_multi_device_ids_not_allowed>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_multi_device_module_device_ids_None>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_single_device_module_device_ids_None>, <__main__.DistributedDataParallelTest testMethod=test_nccl_backend_single_device_module_empty_device_ids>, <__main__.DistributedDataParallelTest testMethod=test_nccl_propagate_error_reason>, <__main__.DistributedDataParallelTest testMethod=test_no_grad>, <__main__.DistributedDataParallelTest testMethod=test_param_layout_mismatch_error>, <__main__.DistributedDataParallelTest testMethod=test_pass_default_pg>, <__main__.DistributedDataParallelTest testMethod=test_powerSGD_ddp_comm_hook_nccl>, <__main__.DistributedDataParallelTest testMethod=test_powerSGD_ddp_comm_hook_nccl_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_sync_batch_norm_empty_input>, <__main__.DistributedDataParallelTest testMethod=test_sync_batch_norm_only_empty_input>]> 2022-08-17T12:59:16.7384081Z test_accumulate_gradients_module (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7384558Z test_accumulate_gradients_module_with_grad_is_view (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7385007Z test_arbitrary_forward_return_value (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7385466Z test_arbitrary_forward_return_value_grad_is_view (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7385921Z test_bf16_compress_wrapper_is_view (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7386347Z test_bf16_compress_wrapper_nccl (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7386778Z test_builtin_ddp_comm_hooks_nccl (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7387236Z 
test_builtin_ddp_comm_hooks_nccl_grad_is_view (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7387677Z test_channels_last_contig (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7388096Z test_ddp_checkpointing_dynamic_module (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7388567Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7389050Z test_ddp_checkpointing_once_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7389513Z test_ddp_checkpointing_once_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7390017Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7390539Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7391044Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7391591Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7392081Z test_ddp_checkpointing_twice_weight_sharing (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7392574Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7393061Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7393617Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7394125Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7394602Z test_ddp_comm_hook_allreduce_hook_nccl (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7395139Z test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7395605Z 
test_ddp_comm_hook_allreduce_hook_nccl_static_graph (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7396085Z test_ddp_comm_hook_allreduce_with_then_hook_nccl (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7396554Z test_ddp_comm_hook_future_passing_gpu_nccl (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7396985Z test_ddp_multi_device_module_config (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7397409Z test_ddp_weight_sharing (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7397831Z test_ddp_with_lazy_parameters (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7398265Z test_default_ddp_comm_hooks_nccl (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7398691Z test_default_ddp_comm_hooks_nccl_is_view (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7399118Z test_failure_recovery (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7399568Z test_find_unused_parameters_kwarg_debug_detail (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7400023Z test_find_unused_parameters_kwarg_debug_info (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7400495Z test_find_unused_parameters_kwarg_debug_off (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7400983Z test_find_unused_parameters_kwarg_grad_is_view_debug_detail (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7401487Z test_find_unused_parameters_kwarg_grad_is_view_debug_info (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7401969Z test_find_unused_parameters_kwarg_grad_is_view_debug_off (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7402392Z test_fp16 (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7402795Z test_fp16_compress_wrapper_is_view (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7403211Z test_fp16_compress_wrapper_nccl (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7403630Z test_fp16_grad_is_view 
(__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7404084Z test_grad_layout_1devicemodule_1replicaperprocess (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7404552Z test_grad_layout_2devicemodule (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7404963Z test_invalid_powerSGD_state (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7405404Z test_multiple_outputs_multiple_backward (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7405876Z test_multiple_outputs_multiple_backward_grad_is_view (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7406347Z test_nccl_backend_1gpu_module_device_ids_integer_list (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7406842Z test_nccl_backend_1gpu_module_device_ids_torch_device_list (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7407304Z test_nccl_backend_2gpu_module (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7407737Z test_nccl_backend_4gpu_module (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7408170Z test_nccl_backend_multi_device_ids_not_allowed (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7408713Z test_nccl_backend_multi_device_module_device_ids_None (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7409211Z test_nccl_backend_single_device_module_device_ids_None (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7409686Z test_nccl_backend_single_device_module_empty_device_ids (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7410147Z test_nccl_propagate_error_reason (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7410548Z test_no_grad (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7410950Z test_param_layout_mismatch_error (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7411349Z test_pass_default_pg (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7411762Z test_powerSGD_ddp_comm_hook_nccl (__main__.DistributedDataParallelTest) 
2022-08-17T12:59:16.7412275Z test_powerSGD_ddp_comm_hook_nccl_grad_is_view (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7412708Z test_sync_batch_norm_empty_input (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7413144Z test_sync_batch_norm_only_empty_input (__main__.DistributedDataParallelTest) 2022-08-17T12:59:16.7413519Z 2022-08-17T12:59:16.7414769Z , <__main__.NcclErrorHandlingTest testMethod=test_nccl_blocking_wait_with_barrier>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_errors_blocking_abort>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_errors_blocking_clean_exit>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_errors_blocking_nonzero_exit>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_errors_blocking_sigkill>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_errors_blocking_sigterm>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_errors_nonblocking>, <__main__.NcclErrorHandlingTest testMethod=test_nccl_timeout>]> 2022-08-17T12:59:16.7416003Z test_invalid_nccl_blocking_wait_env (__main__.NcclErrorHandlingTest) 2022-08-17T12:59:16.7416421Z test_nccl_blocking_wait_with_barrier (__main__.NcclErrorHandlingTest) 2022-08-17T12:59:16.7416805Z test_nccl_errors_blocking_abort (__main__.NcclErrorHandlingTest) 2022-08-17T12:59:16.7417207Z test_nccl_errors_blocking_clean_exit (__main__.NcclErrorHandlingTest) 2022-08-17T12:59:16.7417615Z test_nccl_errors_blocking_nonzero_exit (__main__.NcclErrorHandlingTest) 2022-08-17T12:59:16.7418002Z test_nccl_errors_blocking_sigkill (__main__.NcclErrorHandlingTest) 2022-08-17T12:59:16.7418402Z test_nccl_errors_blocking_sigterm (__main__.NcclErrorHandlingTest) 2022-08-17T12:59:16.7418798Z test_nccl_errors_nonblocking (__main__.NcclErrorHandlingTest) 2022-08-17T12:59:16.7419166Z test_nccl_timeout (__main__.NcclErrorHandlingTest) 2022-08-17T12:59:16.7419607Z ]> 2022-08-17T12:59:16.7420073Z test_init_no_gpus (__main__.ProcessGroupNCCLNoGPUTest) 
2022-08-17T12:59:16.7422039Z , <__main__.ProcessGroupNCCLTest testMethod=test_allgather_base_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_allgather_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_allreduce_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_barrier>, <__main__.ProcessGroupNCCLTest testMethod=test_broadcast_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_empty_tensors>, <__main__.ProcessGroupNCCLTest testMethod=test_gather_checks>, <__main__.ProcessGroupNCCLTest testMethod=test_gather_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_gather_stress>, <__main__.ProcessGroupNCCLTest testMethod=test_reduce_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_reduce_scatter_base_basics>, <__main__.ProcessGroupNCCLTest testMethod=test_reduce_scatter_base_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_reduce_scatter_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_scatter_checks>, <__main__.ProcessGroupNCCLTest testMethod=test_scatter_ops>, <__main__.ProcessGroupNCCLTest testMethod=test_scatter_stress>, <__main__.ProcessGroupNCCLTest testMethod=test_send_recv>]> 2022-08-17T12:59:16.7424389Z test_allgather_base_basics (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7424757Z test_allgather_base_ops (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7425123Z test_allgather_ops (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7425479Z test_allreduce_ops (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7425811Z test_barrier (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7426158Z test_broadcast_ops (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7426515Z test_empty_tensors (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7426851Z test_gather_checks (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7427294Z test_gather_ops (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7427644Z test_gather_stress (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7427977Z test_reduce_ops 
(__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7428351Z test_reduce_scatter_base_basics (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7428742Z test_reduce_scatter_base_ops (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7429117Z test_reduce_scatter_ops (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7429464Z test_scatter_checks (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7429819Z test_scatter_ops (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7430172Z test_scatter_stress (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7430503Z test_send_recv (__main__.ProcessGroupNCCLTest) 2022-08-17T12:59:16.7430925Z ]> 2022-08-17T12:59:16.7431337Z test_common_errors (__main__.RendezvousEnvTest) 2022-08-17T12:59:16.7431650Z 2022-08-17T12:59:16.7432066Z ]> 2022-08-17T12:59:16.7432491Z test_default_store_timeout_nccl (__main__.TimeoutTest) 2022-08-17T12:59:18.1309301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:18.1310286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:18.1311494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:18.1312468Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:18.3076011Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T12:59:18.3093337Z 2022-08-17T12:59:18.3093749Z Running tests... 2022-08-17T12:59:18.3094272Z ---------------------------------------------------------------------- 2022-08-17T12:59:19.8346482Z test_all_reduce_coalesced_nccl (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:59:19.8544524Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30293 2022-08-17T12:59:19.8550900Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30294 2022-08-17T12:59:21.2433707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:21.2434206Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:21.2435216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:21.2435705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:21.2716862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:21.2717333Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:21.2719948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:21.2720731Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:21.4092458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:59:21.4441913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:59:24.1663025Z ok (5.857s) 2022-08-17T12:59:24.1663254Z 2022-08-17T12:59:24.1663847Z ---------------------------------------------------------------------- 2022-08-17T12:59:24.1664202Z Ran 1 test in 5.857s 2022-08-17T12:59:24.1664373Z 2022-08-17T12:59:24.1664471Z OK 2022-08-17T12:59:24.1664614Z 2022-08-17T12:59:24.1664753Z Generating XML reports... 
2022-08-17T12:59:24.1698156Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125918.xml 2022-08-17T12:59:25.9194723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:25.9195510Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:25.9196667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:25.9197176Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:26.0929677Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T12:59:26.0945370Z 2022-08-17T12:59:26.0945631Z Running tests... 2022-08-17T12:59:26.0946073Z ---------------------------------------------------------------------- 2022-08-17T12:59:27.5792828Z test_broadcast_coalesced_nccl (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:59:27.5978786Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30410 2022-08-17T12:59:27.5985269Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30411 2022-08-17T12:59:29.0224376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:29.0225372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:29.0226587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:29.0227474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:29.0472777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:29.0473711Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-08-17T12:59:29.0475730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:29.0476683Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:29.1888646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:59:29.2209238Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:59:32.0145394Z ok (5.920s) 2022-08-17T12:59:32.0145728Z 2022-08-17T12:59:32.0146289Z ---------------------------------------------------------------------- 2022-08-17T12:59:32.0146619Z Ran 1 test in 5.920s 2022-08-17T12:59:32.0146790Z 2022-08-17T12:59:32.0146887Z OK 2022-08-17T12:59:32.0147025Z 2022-08-17T12:59:32.0147164Z Generating XML reports... 2022-08-17T12:59:32.0181174Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125926.xml 2022-08-17T12:59:33.7623641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:33.7624661Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:33.7625847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:33.7627196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:33.9382965Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T12:59:33.9400941Z 2022-08-17T12:59:33.9401399Z Running tests... 2022-08-17T12:59:33.9402017Z ---------------------------------------------------------------------- 2022-08-17T12:59:35.4485586Z test_nccl_barrier (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:59:35.4682040Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30527 2022-08-17T12:59:35.4688619Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30528 2022-08-17T12:59:36.8869583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:36.8870420Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:36.8871207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:36.8871688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:36.8998987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:36.8999451Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:36.9002217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:36.9002691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:37.0543062Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:59:37.0711338Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:59:37.3744663Z skip: Need at least 4 CUDA devices (3.434s) 2022-08-17T12:59:37.3745036Z 2022-08-17T12:59:37.3745515Z ---------------------------------------------------------------------- 2022-08-17T12:59:37.3745871Z Ran 1 test in 3.434s 2022-08-17T12:59:37.3746037Z 2022-08-17T12:59:37.3746132Z OK (skipped=1) 2022-08-17T12:59:37.3746814Z 2022-08-17T12:59:37.3746961Z Generating XML reports... 
2022-08-17T12:59:37.3782793Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125933.xml 2022-08-17T12:59:39.1705010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:39.1705585Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:39.1706873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:39.1707701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:39.3464994Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T12:59:39.3480656Z 2022-08-17T12:59:39.3480880Z Running tests... 2022-08-17T12:59:39.3481756Z ---------------------------------------------------------------------- 2022-08-17T12:59:40.8488049Z test_nccl_barrier_device_ids (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:59:40.8684813Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30630 2022-08-17T12:59:40.8691234Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30631 2022-08-17T12:59:42.2269246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:42.2270246Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:42.2271709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:42.2272682Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:42.2958197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:42.2959168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-08-17T12:59:42.2960336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:42.2961286Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:42.3950325Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:59:42.3953657Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T12:59:42.4686670Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:59:42.4690029Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T12:59:42.4691034Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T12:59:42.4768884Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T12:59:44.2781553Z ok (4.930s) 2022-08-17T12:59:44.2781809Z 2022-08-17T12:59:44.2782213Z ---------------------------------------------------------------------- 2022-08-17T12:59:44.2782557Z Ran 1 test in 4.930s 2022-08-17T12:59:44.2782726Z 2022-08-17T12:59:44.2782812Z OK 2022-08-17T12:59:44.2782949Z 2022-08-17T12:59:44.2783086Z Generating XML reports... 
2022-08-17T12:59:44.2818947Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125939.xml 2022-08-17T12:59:46.0336223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:46.0337092Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:46.0337754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:46.0338227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:46.2023576Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T12:59:46.2039077Z 2022-08-17T12:59:46.2039396Z Running tests... 2022-08-17T12:59:46.2039817Z ---------------------------------------------------------------------- 2022-08-17T12:59:47.6715355Z test_nccl_barrier_device_ids_function_argument (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:59:47.6903068Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30746 2022-08-17T12:59:47.6908570Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30747 2022-08-17T12:59:49.1212119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:49.1212650Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:49.1213739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:49.1214218Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:49.1387861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:49.1388325Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:49.1391609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:49.1392088Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:49.2871138Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:59:49.2874573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T12:59:49.3106103Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:59:49.3110152Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T12:59:49.3110893Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T12:59:49.3181376Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T12:59:49.5961550Z ok (3.392s) 2022-08-17T12:59:49.5961952Z 2022-08-17T12:59:49.5962675Z ---------------------------------------------------------------------- 2022-08-17T12:59:49.5963320Z Ran 1 test in 3.392s 2022-08-17T12:59:49.5963485Z 2022-08-17T12:59:49.5963580Z OK 2022-08-17T12:59:49.5963715Z 2022-08-17T12:59:49.5963850Z Generating XML reports... 2022-08-17T12:59:49.5996812Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125946.xml 2022-08-17T12:59:51.3773918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:51.3774560Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:51.3775860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:51.3776501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:51.5547140Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T12:59:51.5562575Z 2022-08-17T12:59:51.5562796Z Running tests... 2022-08-17T12:59:51.5563698Z ---------------------------------------------------------------------- 2022-08-17T12:59:53.0772201Z test_nccl_barrier_timeout (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:59:53.0968179Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30853 2022-08-17T12:59:53.0974426Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30854 2022-08-17T12:59:54.5213913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:54.5214390Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:54.5215399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:54.5215877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:54.5433704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:54.5434165Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:54.5436614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:54.5437091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:54.6903673Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T12:59:54.7152061Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T12:59:55.0027616Z skip: Need at least 4 CUDA devices (3.446s) 2022-08-17T12:59:55.0027873Z 2022-08-17T12:59:55.0028505Z ---------------------------------------------------------------------- 2022-08-17T12:59:55.0028937Z Ran 1 test in 3.446s 2022-08-17T12:59:55.0029105Z 2022-08-17T12:59:55.0029217Z OK (skipped=1) 2022-08-17T12:59:55.0029356Z 2022-08-17T12:59:55.0029485Z Generating XML reports... 
2022-08-17T12:59:55.0063715Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125951.xml 2022-08-17T12:59:56.7670716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:56.7671485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:56.7672109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:56.7672594Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:56.9425636Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T12:59:56.9441957Z 2022-08-17T12:59:56.9442267Z Running tests... 2022-08-17T12:59:56.9442703Z ---------------------------------------------------------------------- 2022-08-17T12:59:58.4433044Z test_nccl_barrier_timeout_new_group (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T12:59:58.4619969Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30956 2022-08-17T12:59:58.4626722Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30957 2022-08-17T12:59:59.8530997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:59.8531483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:59.8532253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:59.8532732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T12:59:59.8937178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T12:59:59.8937614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T12:59:59.8940648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T12:59:59.8941134Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:00.0191309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:00:00.0595110Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:00:00.3679119Z skip: Need at least 4 CUDA devices (3.423s) 2022-08-17T13:00:00.3679413Z 2022-08-17T13:00:00.3679811Z ---------------------------------------------------------------------- 2022-08-17T13:00:00.3680134Z Ran 1 test in 3.424s 2022-08-17T13:00:00.3680298Z 2022-08-17T13:00:00.3680409Z OK (skipped=1) 2022-08-17T13:00:00.3680564Z 2022-08-17T13:00:00.3680692Z Generating XML reports... 
2022-08-17T13:00:00.3716292Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125956.xml 2022-08-17T13:00:02.1321953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:02.1322980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:02.1324248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:02.1325143Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:02.3129357Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:00:02.3146652Z 2022-08-17T13:00:02.3147121Z Running tests... 2022-08-17T13:00:02.3147626Z ---------------------------------------------------------------------- 2022-08-17T13:00:03.8339272Z test_nccl_barrier_timeout_new_group_non_member (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:00:03.8535797Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31059 2022-08-17T13:00:03.8541860Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31060 2022-08-17T13:00:05.2626380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:05.2627183Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:05.2627980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:05.2628459Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:05.2859627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:05.2860089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:05.2862810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:05.2863287Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:05.4290953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:00:05.4587015Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:00:05.7597577Z skip: Need at least 4 CUDA devices (3.445s) 2022-08-17T13:00:05.7597830Z 2022-08-17T13:00:05.7598437Z ---------------------------------------------------------------------- 2022-08-17T13:00:05.7599157Z Ran 1 test in 3.445s 2022-08-17T13:00:05.7599508Z 2022-08-17T13:00:05.7599724Z OK (skipped=1) 2022-08-17T13:00:05.7599944Z 2022-08-17T13:00:05.7600076Z Generating XML reports... 
2022-08-17T13:00:05.7633348Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130002.xml 2022-08-17T13:00:07.5466283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:07.5466769Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:07.5467594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:07.5468087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:07.7287885Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:00:07.7304136Z 2022-08-17T13:00:07.7304782Z Running tests... 2022-08-17T13:00:07.7305395Z ---------------------------------------------------------------------- 2022-08-17T13:00:09.2434325Z test_nccl_warn_not_in_group_debug_detail (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:00:09.2633474Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31162 2022-08-17T13:00:09.2639768Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31163 2022-08-17T13:00:10.7165992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:10.7166531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:10.7167497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:10.7167966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:10.7316322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:10.7317189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:10.7319724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:10.7320472Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:10.8901536Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:00:10.9026717Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:00:10.9214718Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:00:10.9216110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:00:10.9217283Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:00:10.9218522Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:00:10.9219074Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:00:10.9221949Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:00:10.9223198Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:00:10.9320711Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:00:12.7732170Z ok (5.042s) 2022-08-17T13:00:12.7732608Z 2022-08-17T13:00:12.7733361Z ---------------------------------------------------------------------- 2022-08-17T13:00:12.7733881Z Ran 1 test in 5.043s 2022-08-17T13:00:12.7734051Z 2022-08-17T13:00:12.7734129Z OK 2022-08-17T13:00:12.7734265Z 2022-08-17T13:00:12.7734402Z Generating XML reports... 2022-08-17T13:00:12.7769462Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130007.xml 2022-08-17T13:00:14.5658620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:14.5659162Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:14.5659954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:14.5660451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:14.7433388Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:00:14.7449464Z 2022-08-17T13:00:14.7449610Z Running tests... 
2022-08-17T13:00:14.7450401Z ---------------------------------------------------------------------- 2022-08-17T13:00:16.2689778Z test_nccl_warn_not_in_group_debug_info (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:00:16.2885132Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31293 2022-08-17T13:00:16.2891200Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31294 2022-08-17T13:00:17.7061856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:17.7062576Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:17.7063207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:17.7063965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:17.7219077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:17.7219618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:17.7222237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:17.7222698Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:17.8737405Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:00:17.8740952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:00:17.8941498Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:00:17.8945657Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:00:17.8946645Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:00:17.8947550Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:00:17.8948083Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:00:17.8949436Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:00:17.8950218Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:00:17.9050542Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:00:19.6980025Z ok (4.953s) 2022-08-17T13:00:19.6980365Z 2022-08-17T13:00:19.6981007Z ---------------------------------------------------------------------- 2022-08-17T13:00:19.6981611Z Ran 1 test in 4.953s 2022-08-17T13:00:19.6981914Z 2022-08-17T13:00:19.6982074Z OK 2022-08-17T13:00:19.6982325Z 2022-08-17T13:00:19.6982536Z Generating XML reports... 2022-08-17T13:00:19.7017944Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130014.xml 2022-08-17T13:00:21.4464570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:21.4465085Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:21.4466527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:21.4467073Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:21.6192678Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:00:21.6207745Z 2022-08-17T13:00:21.6208169Z Running tests... 
2022-08-17T13:00:21.6208660Z ---------------------------------------------------------------------- 2022-08-17T13:00:23.1010591Z test_nccl_warn_not_in_group_debug_off (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:00:23.1200537Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31415 2022-08-17T13:00:23.1206673Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31416 2022-08-17T13:00:24.5129283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:24.5130173Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:24.5131424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:24.5131930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:24.5340349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:24.5341055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:24.5345827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:24.5346568Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:24.6801179Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:00:24.6804525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:00:24.7043788Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:00:24.7047630Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:00:24.7049279Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:00:24.7050681Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:00:24.7110835Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:00:24.7111977Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:00:24.7113199Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:00:24.7154305Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:00:26.5296938Z ok (4.909s) 2022-08-17T13:00:26.5297280Z 2022-08-17T13:00:26.5298020Z ---------------------------------------------------------------------- 2022-08-17T13:00:26.5298582Z Ran 1 test in 4.909s 2022-08-17T13:00:26.5298754Z 2022-08-17T13:00:26.5298849Z OK 2022-08-17T13:00:26.5299005Z 2022-08-17T13:00:26.5299144Z Generating XML reports... 2022-08-17T13:00:26.5334830Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130021.xml 2022-08-17T13:00:28.3108115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:28.3108612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:28.3109384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:28.3109868Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:28.4861352Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:00:28.4877776Z 2022-08-17T13:00:28.4877976Z Running tests... 
2022-08-17T13:00:28.4878420Z ---------------------------------------------------------------------- 2022-08-17T13:00:29.9875564Z test_pass_nccl_options_high_priority_stream (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:00:30.0071480Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31537 2022-08-17T13:00:30.0077512Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31538 2022-08-17T13:00:31.4417546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:31.4418045Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:31.4419086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:31.4419564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:31.4638756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:31.4639240Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:31.4641732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:31.4642213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:31.6104144Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:00:31.6108291Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:00:31.6352405Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:00:31.6356118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:00:31.6357141Z INFO:torch.distributed.distributed_c10d:Rank 0: 
Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:00:31.6359823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:00:31.6415708Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:00:31.6418064Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:00:31.6418844Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:00:31.6462572Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:00:34.4188970Z ok (5.931s) 2022-08-17T13:00:34.4189194Z 2022-08-17T13:00:34.4189586Z ---------------------------------------------------------------------- 2022-08-17T13:00:34.4190262Z Ran 1 test in 5.931s 2022-08-17T13:00:34.4190431Z 2022-08-17T13:00:34.4190530Z OK 2022-08-17T13:00:34.4190666Z 2022-08-17T13:00:34.4191836Z Generating XML reports... 2022-08-17T13:00:34.4225039Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130028.xml 2022-08-17T13:00:36.1516073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:36.1517076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:36.1518286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:36.1519200Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:36.3322891Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:00:36.3339792Z 2022-08-17T13:00:36.3340229Z Running tests... 
2022-08-17T13:00:36.3340713Z ---------------------------------------------------------------------- 2022-08-17T13:00:37.8403994Z test_sequence_num_incremented_nccl_default (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:00:37.8600498Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31658 2022-08-17T13:00:37.8606837Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31659 2022-08-17T13:00:39.2697916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:39.2699214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:39.2699869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:39.2700362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:39.3556345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:39.3556858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:39.3559375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:39.3559869Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:39.4374512Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:00:39.4384884Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:00:39.5288979Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:00:39.5300099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:00:39.5300887Z INFO:torch.distributed.distributed_c10d:Rank 1: 
Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:00:39.5301584Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:00:39.5410378Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:00:39.5414931Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:00:39.5415721Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:00:39.5513162Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:00:41.3697489Z ok (5.035s) 2022-08-17T13:00:41.3697710Z 2022-08-17T13:00:41.3698107Z ---------------------------------------------------------------------- 2022-08-17T13:00:41.3698433Z Ran 1 test in 5.036s 2022-08-17T13:00:41.3698899Z 2022-08-17T13:00:41.3698994Z OK 2022-08-17T13:00:41.3699135Z 2022-08-17T13:00:41.3699271Z Generating XML reports... 2022-08-17T13:00:41.3733438Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130036.xml 2022-08-17T13:00:43.1444053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:43.1444573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:43.1445332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:43.1445812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:43.3198703Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:00:43.3214599Z 2022-08-17T13:00:43.3214805Z Running tests... 
2022-08-17T13:00:43.3215222Z ---------------------------------------------------------------------- 2022-08-17T13:00:44.8300511Z test_sequence_num_incremented_nccl_subgroup (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:00:44.8496017Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31780 2022-08-17T13:00:44.8502005Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31781 2022-08-17T13:00:46.2419618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:46.2420396Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:46.2421590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:46.2422109Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:46.2767975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:46.2768448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:46.2771248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:46.2771740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:46.4086748Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:00:46.4518129Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:00:46.7556091Z skip: Need at least 4 CUDA devices (3.434s) 2022-08-17T13:00:46.7556352Z 2022-08-17T13:00:46.7556765Z ---------------------------------------------------------------------- 2022-08-17T13:00:46.7557095Z Ran 1 test in 3.434s 2022-08-17T13:00:46.7557261Z 2022-08-17T13:00:46.7557375Z OK (skipped=1) 
2022-08-17T13:00:46.7557531Z 2022-08-17T13:00:46.7557660Z Generating XML reports... 2022-08-17T13:00:46.7591896Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130043.xml 2022-08-17T13:00:48.5354311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:48.5355112Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:48.5356182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:48.5356704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:48.7104823Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:00:48.7121050Z 2022-08-17T13:00:48.7121293Z Running tests... 2022-08-17T13:00:48.7121724Z ---------------------------------------------------------------------- 2022-08-17T13:00:50.2321785Z test_sequence_num_set_default_pg_nccl (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:00:50.2516972Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31883 2022-08-17T13:00:50.2522755Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31884 2022-08-17T13:00:51.6851609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:51.6852117Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:51.6853092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:51.6853576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:51.7349208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:51.7349664Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:51.7352351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:51.7352849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:51.8515381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:00:51.8525636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:00:51.9079492Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:00:51.9090527Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:00:51.9091240Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:00:51.9137293Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:00:53.6610239Z ok (4.949s) 2022-08-17T13:00:53.6610518Z 2022-08-17T13:00:53.6610916Z ---------------------------------------------------------------------- 2022-08-17T13:00:53.6611257Z Ran 1 test in 4.949s 2022-08-17T13:00:53.6611425Z 2022-08-17T13:00:53.6611526Z OK 2022-08-17T13:00:53.6611645Z 2022-08-17T13:00:53.6611796Z Generating XML reports... 2022-08-17T13:00:53.6645706Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130048.xml 2022-08-17T13:00:55.4374582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:55.4375370Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:55.4376472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:55.4377009Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:55.6139533Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:00:55.6155965Z 2022-08-17T13:00:55.6156239Z Running tests... 2022-08-17T13:00:55.6156680Z ---------------------------------------------------------------------- 2022-08-17T13:00:57.1215353Z test_sequence_num_set_nccl_new_group (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:00:57.1411430Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31999 2022-08-17T13:00:57.1417567Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32000 2022-08-17T13:00:58.5301580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:58.5302604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:58.5304080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:58.5305000Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:58.5810898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:00:58.5811846Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:00:58.5813362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:00:58.5814321Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:00:58.6978295Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:00:58.6989703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:00:58.7523303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:00:58.7534123Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:00:58.7535153Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:00:58.7537971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:00:58.7603669Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:00:58.7606503Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:00:58.7607781Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:00:58.7641572Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:01:00.5517533Z ok (4.936s) 2022-08-17T13:01:00.5517753Z 2022-08-17T13:01:00.5518114Z ---------------------------------------------------------------------- 2022-08-17T13:01:00.5518486Z Ran 1 test in 4.936s 2022-08-17T13:01:00.5518654Z 2022-08-17T13:01:00.5518755Z OK 2022-08-17T13:01:00.5518890Z 2022-08-17T13:01:00.5519024Z Generating XML reports... 2022-08-17T13:01:00.5554458Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130055.xml 2022-08-17T13:01:02.3122786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:02.3123309Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:02.3124102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:02.3124573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:02.4850825Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:01:02.4866820Z 2022-08-17T13:01:02.4867073Z Running tests... 
2022-08-17T13:01:02.4867501Z ---------------------------------------------------------------------- 2022-08-17T13:01:03.9517975Z test_accumulate_gradients_module (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:01:03.9705501Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32119 2022-08-17T13:01:03.9711713Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32120 2022-08-17T13:01:05.3581888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:05.3582437Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:05.3583233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:05.3584073Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:05.3860754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:05.3861631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:05.3864314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:05.3864808Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:05.5238267Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:01:05.5579122Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:01:06.7782162Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy58h6xsa 2022-08-17T13:01:06.7782756Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy58h6xsa/_remote_module_non_scriptable.py 2022-08-17T13:01:06.8333126Z 
INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1w1i0j52 2022-08-17T13:01:06.8334347Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1w1i0j52/_remote_module_non_scriptable.py 2022-08-17T13:01:08.3452258Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:08.3452854Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:08.3489518Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:08.3490068Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:08.3564076Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:01:08.3572886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:01:08.8833873Z ok (6.396s) 2022-08-17T13:01:08.8834185Z 2022-08-17T13:01:08.8834657Z ---------------------------------------------------------------------- 2022-08-17T13:01:08.8835002Z Ran 1 test in 6.397s 2022-08-17T13:01:08.8835169Z 2022-08-17T13:01:08.8835265Z OK 2022-08-17T13:01:08.8835382Z 2022-08-17T13:01:08.8835523Z Generating XML reports... 
2022-08-17T13:01:08.8871417Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130102.xml 2022-08-17T13:01:10.6384206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:10.6384696Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:10.6386480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:10.6386968Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:10.8158262Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:01:10.8175700Z 2022-08-17T13:01:10.8175932Z Running tests... 2022-08-17T13:01:10.8176374Z ---------------------------------------------------------------------- 2022-08-17T13:01:12.3175816Z test_accumulate_gradients_module_with_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:01:12.3486767Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32240 2022-08-17T13:01:12.3492854Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32241 2022-08-17T13:01:13.7889878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:13.7890449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:13.7891054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:13.7891516Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:13.8711343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:13.8711839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:13.8714552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:13.8715018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:13.9605446Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:01:14.0429624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:01:15.2455106Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnz19xfnh 2022-08-17T13:01:15.2455833Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnz19xfnh/_remote_module_non_scriptable.py 2022-08-17T13:01:15.3202393Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5qmay8bs 2022-08-17T13:01:15.3204910Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5qmay8bs/_remote_module_non_scriptable.py 2022-08-17T13:01:16.8420502Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:16.8421123Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:16.8439566Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:16.8440152Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:16.8550539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:01:16.8558753Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:01:17.3618112Z ok (6.544s) 2022-08-17T13:01:17.3618361Z 2022-08-17T13:01:17.3618758Z ---------------------------------------------------------------------- 2022-08-17T13:01:17.3619102Z Ran 1 test in 6.544s 2022-08-17T13:01:17.3619275Z 2022-08-17T13:01:17.3619372Z OK 2022-08-17T13:01:17.3619507Z 2022-08-17T13:01:17.3619644Z Generating XML reports... 
2022-08-17T13:01:17.3656771Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130110.xml 2022-08-17T13:01:19.1311548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:19.1312057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:19.1312837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:19.1313320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:19.3008582Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:01:19.3023491Z 2022-08-17T13:01:19.3023749Z Running tests... 2022-08-17T13:01:19.3024474Z ---------------------------------------------------------------------- 2022-08-17T13:01:20.7684412Z test_arbitrary_forward_return_value (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:01:20.7873307Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32361 2022-08-17T13:01:20.7878889Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32362 2022-08-17T13:01:22.2146239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:22.2146754Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:22.2147909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:22.2148682Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:22.2386980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:22.2387447Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:22.2390167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:22.2390644Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:22.3806463Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:01:22.4114309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:01:23.6415076Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjfr81ubj 2022-08-17T13:01:23.6415694Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjfr81ubj/_remote_module_non_scriptable.py 2022-08-17T13:01:23.7036845Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi5ikk3pu 2022-08-17T13:01:23.7037663Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi5ikk3pu/_remote_module_non_scriptable.py 2022-08-17T13:01:24.6953075Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:24.6953676Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:24.6954353Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:24.6954899Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:25.1144843Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:01:25.1145369Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:01:25.6001804Z ok (6.297s) 2022-08-17T13:01:25.6002005Z 2022-08-17T13:01:25.6002407Z ---------------------------------------------------------------------- 2022-08-17T13:01:25.6002744Z Ran 1 test in 6.298s 2022-08-17T13:01:25.6002911Z 2022-08-17T13:01:25.6003005Z OK 2022-08-17T13:01:25.6003140Z 2022-08-17T13:01:25.6003255Z Generating XML reports... 
2022-08-17T13:01:25.6039384Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130119.xml 2022-08-17T13:01:27.3844591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:27.3845084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:27.3846097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:27.3846597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:27.5597046Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:01:27.5612676Z 2022-08-17T13:01:27.5613192Z Running tests... 2022-08-17T13:01:27.5613668Z ---------------------------------------------------------------------- 2022-08-17T13:01:29.0607301Z test_arbitrary_forward_return_value_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:01:29.0796462Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32482 2022-08-17T13:01:29.0802927Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32483 2022-08-17T13:01:30.5150891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:30.5151752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:30.5152336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:30.5152824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:30.6030250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:30.6030742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:30.6032397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:30.6032884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:30.6818658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:01:30.7757829Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:01:31.9354366Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjp1m9j0f 2022-08-17T13:01:31.9354989Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjp1m9j0f/_remote_module_non_scriptable.py 2022-08-17T13:01:32.0500448Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgjrar5pz 2022-08-17T13:01:32.0501701Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgjrar5pz/_remote_module_non_scriptable.py 2022-08-17T13:01:33.1399356Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:33.1399956Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:33.1400663Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:33.1401227Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:33.5517509Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:01:33.5518065Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:01:34.0927811Z ok (6.531s) 2022-08-17T13:01:34.0928048Z 2022-08-17T13:01:34.0928720Z ---------------------------------------------------------------------- 2022-08-17T13:01:34.0929460Z Ran 1 test in 6.531s 2022-08-17T13:01:34.0929740Z 2022-08-17T13:01:34.0929848Z OK 2022-08-17T13:01:34.0929984Z 2022-08-17T13:01:34.0930118Z Generating XML reports... 
2022-08-17T13:01:34.0965210Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130127.xml 2022-08-17T13:01:35.8270672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:35.8271192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:35.8272856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:35.8273610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:36.0044379Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:01:36.0060866Z 2022-08-17T13:01:36.0061300Z Running tests... 2022-08-17T13:01:36.0061785Z ---------------------------------------------------------------------- 2022-08-17T13:01:37.5132592Z test_bf16_compress_wrapper_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:01:37.5330294Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32603 2022-08-17T13:01:37.5336175Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32604 2022-08-17T13:01:38.9631300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:38.9631842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:38.9633363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:38.9633854Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:38.9653389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:38.9653851Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:38.9657161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:38.9657643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:39.1361896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:01:39.1363310Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:01:39.1410410Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:01:39.1413490Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 
1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:01:40.4058964Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwq6d0qsh 2022-08-17T13:01:40.4059598Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwq6d0qsh/_remote_module_non_scriptable.py 2022-08-17T13:01:40.4504573Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo37t_4on 2022-08-17T13:01:40.4506031Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo37t_4on/_remote_module_non_scriptable.py 2022-08-17T13:01:41.4580537Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:41.4581161Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:41.4581875Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:41.4582579Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:41.9450264Z ok (5.939s) 2022-08-17T13:01:41.9450599Z 2022-08-17T13:01:41.9451140Z ---------------------------------------------------------------------- 2022-08-17T13:01:41.9451484Z Ran 1 test in 5.939s 2022-08-17T13:01:41.9451633Z 2022-08-17T13:01:41.9451730Z OK 2022-08-17T13:01:41.9452170Z 2022-08-17T13:01:41.9452328Z Generating XML reports... 
2022-08-17T13:01:41.9487264Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130136.xml 2022-08-17T13:01:43.6710015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:43.6711029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:43.6712252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:43.6713225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:43.8476547Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:01:43.8494005Z 2022-08-17T13:01:43.8494466Z Running tests... 2022-08-17T13:01:43.8494956Z ---------------------------------------------------------------------- 2022-08-17T13:01:45.3605997Z test_bf16_compress_wrapper_nccl (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:01:45.3801142Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32724 2022-08-17T13:01:45.3807435Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32725 2022-08-17T13:01:46.8308785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:46.8309284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:46.8310058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:46.8310561Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:46.8382946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:46.8383417Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:46.8386410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:46.8386893Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:47.0031204Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:01:47.0033735Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:01:47.0042219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:01:47.0045364Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 
1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:01:48.2838455Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7rnhp2s0 2022-08-17T13:01:48.2839535Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7rnhp2s0/_remote_module_non_scriptable.py 2022-08-17T13:01:48.3035971Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfxnk50br 2022-08-17T13:01:48.3038189Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfxnk50br/_remote_module_non_scriptable.py 2022-08-17T13:01:49.3856181Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:49.3857101Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:49.3857832Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:49.3858376Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:49.8920309Z ok (6.042s) 2022-08-17T13:01:49.8920522Z 2022-08-17T13:01:49.8920919Z ---------------------------------------------------------------------- 2022-08-17T13:01:49.8921262Z Ran 1 test in 6.043s 2022-08-17T13:01:49.8921434Z 2022-08-17T13:01:49.8921532Z OK 2022-08-17T13:01:49.8921669Z 2022-08-17T13:01:49.8921804Z Generating XML reports... 
2022-08-17T13:01:49.8956852Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130143.xml 2022-08-17T13:01:51.6152331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:51.6152840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:51.6153621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:51.6154084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:51.7919386Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:01:51.7934921Z 2022-08-17T13:01:51.7935404Z Running tests... 2022-08-17T13:01:51.7935908Z ---------------------------------------------------------------------- 2022-08-17T13:01:53.2931156Z test_builtin_ddp_comm_hooks_nccl (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:01:53.3126049Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32845 2022-08-17T13:01:53.3132414Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32846 2022-08-17T13:01:54.7016796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:54.7017299Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:54.7018078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:54.7018557Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:54.7358869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:54.7359316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:54.7361979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:54.7362477Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:54.8672998Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:01:54.9093133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:01:56.1222746Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5bt9ywg0 2022-08-17T13:01:56.1223670Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5bt9ywg0/_remote_module_non_scriptable.py 2022-08-17T13:01:56.1676882Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuul27aj3 2022-08-17T13:01:56.1677992Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuul27aj3/_remote_module_non_scriptable.py 2022-08-17T13:01:57.2596880Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:57.2597498Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:57.2598503Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:01:57.2599071Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:01:57.7246243Z ok (5.931s) 2022-08-17T13:01:57.7246445Z 2022-08-17T13:01:57.7246828Z ---------------------------------------------------------------------- 2022-08-17T13:01:57.7247186Z Ran 1 test in 5.931s 2022-08-17T13:01:57.7247353Z 2022-08-17T13:01:57.7247450Z OK 2022-08-17T13:01:57.7247568Z 2022-08-17T13:01:57.7247707Z Generating XML reports... 2022-08-17T13:01:57.7282844Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130151.xml 2022-08-17T13:01:59.5005436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:01:59.5005923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:01:59.5006847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:01:59.5007330Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:01:59.6752287Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:01:59.6768285Z 2022-08-17T13:01:59.6768662Z Running tests... 2022-08-17T13:01:59.6769160Z ---------------------------------------------------------------------- 2022-08-17T13:02:01.1757766Z test_builtin_ddp_comm_hooks_nccl_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:02:01.1951340Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32966 2022-08-17T13:02:01.1957571Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32967 2022-08-17T13:02:02.6390547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:02.6391212Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:02.6391825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:02.6392306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:02.6440775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:02.6441245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:02.6444132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:02.6444627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:02.8063055Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:02:02.8198430Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:02:04.0882429Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxh00ox6h 2022-08-17T13:02:04.0883028Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxh00ox6h/_remote_module_non_scriptable.py 2022-08-17T13:02:04.1292551Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0up905mn 2022-08-17T13:02:04.1294644Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0up905mn/_remote_module_non_scriptable.py 2022-08-17T13:02:05.2148118Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:05.2148743Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:05.2149768Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:05.2150343Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:05.7073131Z ok (6.030s) 2022-08-17T13:02:05.7073350Z 2022-08-17T13:02:05.7073740Z ---------------------------------------------------------------------- 2022-08-17T13:02:05.7074082Z Ran 1 test in 6.030s 2022-08-17T13:02:05.7074259Z 2022-08-17T13:02:05.7074336Z OK 2022-08-17T13:02:05.7076756Z 2022-08-17T13:02:05.7077165Z Generating XML reports... 2022-08-17T13:02:05.7108953Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130159.xml 2022-08-17T13:02:07.4512140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:07.4512936Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:07.4513534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:07.4514020Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:07.6202500Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:02:07.6217480Z 2022-08-17T13:02:07.6217673Z Running tests... 2022-08-17T13:02:07.6218106Z ---------------------------------------------------------------------- 2022-08-17T13:02:09.1073820Z test_channels_last_contig (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:02:09.1261579Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33087 2022-08-17T13:02:09.1268436Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33088 2022-08-17T13:02:10.5232305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:10.5232856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:10.5234072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:10.5234555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:10.5484824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:10.5485303Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:10.5488129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:10.5488608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:10.6905609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:02:10.7221432Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:02:13.6384191Z ok (6.016s) 2022-08-17T13:02:13.6384411Z 2022-08-17T13:02:13.6384800Z ---------------------------------------------------------------------- 2022-08-17T13:02:13.6385146Z Ran 1 test in 6.017s 2022-08-17T13:02:13.6385321Z 2022-08-17T13:02:13.6385419Z OK 2022-08-17T13:02:13.6385554Z 2022-08-17T13:02:13.6385696Z Generating XML reports... 
2022-08-17T13:02:13.6420977Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130207.xml 2022-08-17T13:02:15.4290856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:15.4291728Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:15.4292709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:15.4293174Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:15.6074846Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:02:15.6090800Z 2022-08-17T13:02:15.6091157Z Running tests... 2022-08-17T13:02:15.6091676Z ---------------------------------------------------------------------- 2022-08-17T13:02:15.6098597Z test_ddp_checkpointing_dynamic_module (__main__.DistributedDataParallelTest) 2022-08-17T13:02:17.1064256Z Dynamic module can be checkpointed, multiple times, with non-reentrant ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:02:17.1253329Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33204 2022-08-17T13:02:17.1259636Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33205 2022-08-17T13:02:18.4747798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:18.4748611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:18.4749558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:18.4750038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:18.5504969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:18.5505439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:18.5508377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:18.5508857Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:18.6412696Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:02:18.7234824Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:02:19.8944153Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1us6geay 2022-08-17T13:02:19.8945022Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1us6geay/_remote_module_non_scriptable.py 2022-08-17T13:02:20.0009901Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe13j4rqt 2022-08-17T13:02:20.0011625Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe13j4rqt/_remote_module_non_scriptable.py 2022-08-17T13:02:20.4964961Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:20.4965557Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:20.5064040Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:20.5064636Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:21.0358519Z ok (5.426s) 2022-08-17T13:02:21.0358706Z 2022-08-17T13:02:21.0359088Z ---------------------------------------------------------------------- 2022-08-17T13:02:21.0359427Z Ran 1 test in 5.427s 2022-08-17T13:02:21.0359599Z 2022-08-17T13:02:21.0359675Z OK 2022-08-17T13:02:21.0359814Z 2022-08-17T13:02:21.0359947Z Generating XML reports... 2022-08-17T13:02:21.0395934Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130215.xml 2022-08-17T13:02:22.8177054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:22.8177553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:22.8178386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:22.8178864Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:22.9953169Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:02:22.9968754Z 2022-08-17T13:02:22.9969007Z Running tests... 
2022-08-17T13:02:22.9969439Z ---------------------------------------------------------------------- 2022-08-17T13:02:22.9976789Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.DistributedDataParallelTest) 2022-08-17T13:02:24.4941742Z Dynamic module can be checkpointed multiple times with weight sharing ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:02:24.5138598Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33324 2022-08-17T13:02:24.5145012Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33325 2022-08-17T13:02:25.9543138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:25.9544236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:25.9544846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:25.9545329Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:25.9835756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:25.9836287Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:25.9836932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:25.9837413Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:26.1212195Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:02:26.1550670Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:02:27.3723721Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp16va42tm 2022-08-17T13:02:27.3724338Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmp16va42tm/_remote_module_non_scriptable.py 2022-08-17T13:02:27.4303835Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpldxjr6n7 2022-08-17T13:02:27.4305214Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpldxjr6n7/_remote_module_non_scriptable.py 2022-08-17T13:02:27.9259005Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:27.9259618Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:27.9377339Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:27.9377934Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:28.4261458Z ok (5.429s) 2022-08-17T13:02:28.4261661Z 2022-08-17T13:02:28.4262068Z ---------------------------------------------------------------------- 2022-08-17T13:02:28.4262391Z Ran 1 test in 5.429s 2022-08-17T13:02:28.4262558Z 2022-08-17T13:02:28.4262653Z OK 2022-08-17T13:02:28.4262791Z 2022-08-17T13:02:28.4262928Z Generating XML reports... 
2022-08-17T13:02:28.4299827Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130222.xml 2022-08-17T13:02:30.2084027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:30.2084509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:30.2085703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:30.2086191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:30.3833150Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:02:30.3848448Z 2022-08-17T13:02:30.3848861Z Running tests... 2022-08-17T13:02:30.3849357Z ---------------------------------------------------------------------- 2022-08-17T13:02:30.3858081Z test_ddp_checkpointing_once_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:02:31.8910963Z DDP works as expected when layer is checkpointed only once. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:02:31.9105931Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33444 2022-08-17T13:02:31.9112133Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33445 2022-08-17T13:02:33.2741033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:33.2741662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:33.2742293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:33.2742771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:33.3429345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:33.3429801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:33.3432463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:33.3432949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:33.4418089Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:02:33.5160272Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:02:34.6862118Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvkmuejkd 2022-08-17T13:02:34.6863029Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvkmuejkd/_remote_module_non_scriptable.py 2022-08-17T13:02:34.7788427Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpomzzpnwm 2022-08-17T13:02:34.7789525Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpomzzpnwm/_remote_module_non_scriptable.py 2022-08-17T13:02:35.2729972Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:35.2730584Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:35.2822658Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:35.2823228Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:35.2868435Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:35.2886371Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:35.3178853Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:35.3185123Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:35.3334518Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:02:35.3335369Z warnings.warn( 2022-08-17T13:02:35.3336703Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-08-17T13:02:35.3337454Z warnings.warn( 2022-08-17T13:02:35.3437363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:35.3443919Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:35.3648395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:35.3651368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:35.3943828Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:35.3946474Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:35.4195569Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:35.4199213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:35.9227839Z ok (5.538s) 2022-08-17T13:02:35.9228028Z 2022-08-17T13:02:35.9228432Z ---------------------------------------------------------------------- 2022-08-17T13:02:35.9228775Z Ran 1 test in 5.538s 2022-08-17T13:02:35.9229000Z 2022-08-17T13:02:35.9229100Z OK 2022-08-17T13:02:35.9229244Z 2022-08-17T13:02:35.9229361Z Generating XML reports... 
2022-08-17T13:02:35.9264465Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130230.xml 2022-08-17T13:02:37.6972835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:37.6973369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:37.6974190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:37.6974655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:37.8725778Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:02:37.8741962Z 2022-08-17T13:02:37.8742131Z Running tests... 2022-08-17T13:02:37.8742582Z ---------------------------------------------------------------------- 2022-08-17T13:02:37.8752086Z test_ddp_checkpointing_once_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:02:39.3810847Z DDP works as expected when layer is checkpointed only once. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:02:39.4007575Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33564 2022-08-17T13:02:39.4013831Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33565 2022-08-17T13:02:40.7965433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:40.7966229Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:40.7966838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:40.7967308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:40.8223654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:40.8224125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:40.8227202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:40.8227684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:40.9638632Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:02:40.9943918Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:02:42.2245482Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpewp_h3x6 2022-08-17T13:02:42.2246092Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpewp_h3x6/_remote_module_non_scriptable.py 2022-08-17T13:02:42.2620100Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5wlllo2j 2022-08-17T13:02:42.2622314Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5wlllo2j/_remote_module_non_scriptable.py 2022-08-17T13:02:42.7619219Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:42.7620151Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:42.7695052Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:42.7695617Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:42.7758406Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:42.7758890Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:42.8058493Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:42.8059020Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:42.8208500Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:02:42.8209360Z warnings.warn( 2022-08-17T13:02:42.8210413Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-08-17T13:02:42.8211100Z warnings.warn( 2022-08-17T13:02:42.8314399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:42.8314926Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:42.8519349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:42.8519852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:42.8809574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:42.8810074Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:42.9056995Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:42.9057468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:43.4114165Z ok (5.537s) 2022-08-17T13:02:43.4114365Z 2022-08-17T13:02:43.4114773Z ---------------------------------------------------------------------- 2022-08-17T13:02:43.4115115Z Ran 1 test in 5.537s 2022-08-17T13:02:43.4115303Z 2022-08-17T13:02:43.4115381Z OK 2022-08-17T13:02:43.4115520Z 2022-08-17T13:02:43.4115658Z Generating XML reports... 
2022-08-17T13:02:43.4151478Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130237.xml 2022-08-17T13:02:45.1911584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:45.1912503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:45.1913381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:45.1913865Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:45.3659611Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:02:45.3676726Z 2022-08-17T13:02:45.3676873Z Running tests... 2022-08-17T13:02:45.3677590Z ---------------------------------------------------------------------- 2022-08-17T13:02:45.3685009Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:02:46.8730256Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:02:46.8925290Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33684 2022-08-17T13:02:46.8931362Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33685 2022-08-17T13:02:48.3300276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:48.3300866Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:48.3301844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:48.3302309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:48.3818347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:48.3818813Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:48.3821484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:48.3821947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:48.5029107Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:02:48.5514502Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:02:49.7893914Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnchdwh5l 2022-08-17T13:02:49.7894818Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnchdwh5l/_remote_module_non_scriptable.py 2022-08-17T13:02:49.7991737Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7fnqpaxm 2022-08-17T13:02:49.7994428Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7fnqpaxm/_remote_module_non_scriptable.py 2022-08-17T13:02:50.3180558Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:50.3181162Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:50.3196774Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:50.3197333Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:50.3323677Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:50.3324180Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:50.3622187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:50.3622927Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:50.8030631Z ok (5.435s) 2022-08-17T13:02:50.8030873Z 2022-08-17T13:02:50.8031292Z ---------------------------------------------------------------------- 2022-08-17T13:02:50.8031646Z Ran 1 test in 5.435s 2022-08-17T13:02:50.8031816Z 2022-08-17T13:02:50.8031912Z OK 2022-08-17T13:02:50.8032028Z 2022-08-17T13:02:50.8032165Z Generating XML reports... 
2022-08-17T13:02:50.8067429Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130245.xml 2022-08-17T13:02:52.5424224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:52.5424706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:52.5426465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:52.5426946Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:52.7101374Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:02:52.7117419Z 2022-08-17T13:02:52.7117560Z Running tests... 2022-08-17T13:02:52.7117990Z ---------------------------------------------------------------------- 2022-08-17T13:02:52.7125463Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:02:54.1787775Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:02:54.1974667Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33804 2022-08-17T13:02:54.1980700Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33805 2022-08-17T13:02:55.6461030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:55.6461636Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:55.6462245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:55.6462723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:55.6695784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:02:55.6696251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:02:55.6699431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:02:55.8120520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:02:55.8120982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:02:55.8416992Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:02:57.0816796Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppmd_7cpc 2022-08-17T13:02:57.0817402Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppmd_7cpc/_remote_module_non_scriptable.py 2022-08-17T13:02:57.1445589Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppxg78a07 2022-08-17T13:02:57.1447414Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppxg78a07/_remote_module_non_scriptable.py 2022-08-17T13:02:57.6452989Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:57.6453602Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:57.6529226Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:02:57.6530077Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:02:57.6684377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:57.6685169Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:57.7098916Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:57.7099432Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:02:58.2082215Z ok (5.496s) 2022-08-17T13:02:58.2082444Z 2022-08-17T13:02:58.2082831Z ---------------------------------------------------------------------- 2022-08-17T13:02:58.2083185Z Ran 1 test in 5.496s 2022-08-17T13:02:58.2083352Z 2022-08-17T13:02:58.2083751Z OK 2022-08-17T13:02:58.2083895Z 2022-08-17T13:02:58.2084035Z Generating XML reports... 
2022-08-17T13:02:58.2119006Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130252.xml 2022-08-17T13:03:00.0109986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:00.0110704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:00.0111586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:00.0112070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:00.1883440Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:03:00.1899375Z 2022-08-17T13:03:00.1899525Z Running tests... 2022-08-17T13:03:00.1900316Z ---------------------------------------------------------------------- 2022-08-17T13:03:00.1910785Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:03:01.7054042Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:03:01.7250200Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33924 2022-08-17T13:03:01.7256543Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33925 2022-08-17T13:03:03.1205367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:03.1205875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:03.1206479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:03.1206943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:03.1466454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:03.1466942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:03.1470367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:03.1470824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:03.2874487Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:03:03.3200271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:03:04.5469812Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqghh2jal 2022-08-17T13:03:04.5470425Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqghh2jal/_remote_module_non_scriptable.py 2022-08-17T13:03:04.5861632Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpshdn0kfb 2022-08-17T13:03:04.5864268Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpshdn0kfb/_remote_module_non_scriptable.py 2022-08-17T13:03:05.0837291Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:05.0837890Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:05.0920458Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:05.0921012Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:05.0977497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:05.0994005Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:05.1242454Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:03:05.1243998Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-08-17T13:03:05.1600263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:05.1601117Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:05.6357412Z ok (5.445s) 2022-08-17T13:03:05.6357732Z 2022-08-17T13:03:05.6358258Z ---------------------------------------------------------------------- 2022-08-17T13:03:05.6358592Z Ran 1 test in 5.446s 2022-08-17T13:03:05.6358760Z 2022-08-17T13:03:05.6358856Z OK 2022-08-17T13:03:05.6358996Z 2022-08-17T13:03:05.6359133Z Generating XML reports... 2022-08-17T13:03:05.6394414Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130300.xml 2022-08-17T13:03:07.4032433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:07.4032987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:07.4033970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:07.4034555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:07.5780677Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:03:07.5797852Z 2022-08-17T13:03:07.5798235Z Running tests... 2022-08-17T13:03:07.5798673Z ---------------------------------------------------------------------- 2022-08-17T13:03:07.5808721Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:03:09.0713650Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:03:09.0909034Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34044 2022-08-17T13:03:09.0915991Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34045 2022-08-17T13:03:10.4522040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:10.4523455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:10.4524643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:10.4525596Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:10.5152495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:10.5153403Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:10.5154875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:10.5155785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:10.6201124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:03:10.6876410Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:03:11.8782464Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4uphuklj 2022-08-17T13:03:11.8783959Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4uphuklj/_remote_module_non_scriptable.py 2022-08-17T13:03:11.9647232Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeo_eir8r 2022-08-17T13:03:11.9648945Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeo_eir8r/_remote_module_non_scriptable.py 2022-08-17T13:03:12.4680201Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:12.4681265Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:12.4778040Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:12.4778631Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:13.0014969Z ok (5.421s) 2022-08-17T13:03:13.0015153Z 2022-08-17T13:03:13.0015555Z ---------------------------------------------------------------------- 2022-08-17T13:03:13.0015892Z Ran 1 test in 5.422s 2022-08-17T13:03:13.0016065Z 2022-08-17T13:03:13.0016160Z OK 2022-08-17T13:03:13.0016300Z 2022-08-17T13:03:13.0016413Z Generating XML reports... 2022-08-17T13:03:13.0052323Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130307.xml 2022-08-17T13:03:14.7558950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:14.7559467Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:14.7560369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:14.7560872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:14.9257411Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:03:14.9273098Z 2022-08-17T13:03:14.9273551Z Running tests... 
2022-08-17T13:03:14.9274062Z ---------------------------------------------------------------------- 2022-08-17T13:03:14.9280660Z test_ddp_checkpointing_twice_weight_sharing (__main__.DistributedDataParallelTest) 2022-08-17T13:03:16.3900375Z Checkpointing should work with static graph in the case of checkpointing ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:03:16.4089017Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34164 2022-08-17T13:03:16.4095338Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34165 2022-08-17T13:03:17.8018234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:17.8019063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:17.8020074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:17.8020581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:17.8297782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:17.8298274Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:17.8301324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:17.8301813Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:17.9691752Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:03:18.0056970Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:03:19.2197226Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp_9sbh7a 2022-08-17T13:03:19.2197803Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpp_9sbh7a/_remote_module_non_scriptable.py 2022-08-17T13:03:19.2847812Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0v09r9ev 2022-08-17T13:03:19.2849135Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0v09r9ev/_remote_module_non_scriptable.py 2022-08-17T13:03:19.7855341Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:19.7855985Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:19.7887819Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:19.7888415Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:19.8007436Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:19.8007940Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:19.8284930Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:19.8285453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:20.3194084Z ok (5.392s) 2022-08-17T13:03:20.3194427Z 2022-08-17T13:03:20.3194873Z ---------------------------------------------------------------------- 2022-08-17T13:03:20.3195205Z Ran 1 test in 5.392s 2022-08-17T13:03:20.3195393Z 2022-08-17T13:03:20.3195490Z OK 2022-08-17T13:03:20.3195630Z 2022-08-17T13:03:20.3196506Z Generating XML reports... 
2022-08-17T13:03:20.3231650Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130314.xml 2022-08-17T13:03:22.0638737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:22.0639250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:22.0640405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:22.0640974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:22.2399086Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:03:22.2415269Z 2022-08-17T13:03:22.2415409Z Running tests... 2022-08-17T13:03:22.2416063Z ---------------------------------------------------------------------- 2022-08-17T13:03:22.2427008Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:03:23.7579459Z With reentrant autograd checkpointing impl, DDP will fail when there are ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:03:23.7775279Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34284 2022-08-17T13:03:23.7781586Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34285 2022-08-17T13:03:25.1232238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:25.1233202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:25.1234400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:25.1235296Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:25.2050510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:25.2051450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:25.2053716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:25.2054663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:25.2898552Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:03:25.3810317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:03:26.5639107Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9iuz4t_b 2022-08-17T13:03:26.5639736Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9iuz4t_b/_remote_module_non_scriptable.py 2022-08-17T13:03:26.6444135Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx8cgutek 2022-08-17T13:03:26.6445344Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx8cgutek/_remote_module_non_scriptable.py 2022-08-17T13:03:27.1569798Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:27.1570413Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:27.1571639Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:03:27.1637280Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:27.1637866Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:27.1641772Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-08-17T13:03:27.1900360Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:03:27.1901152Z warnings.warn( 2022-08-17T13:03:27.1902213Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:03:27.1903013Z warnings.warn( 2022-08-17T13:03:27.2006010Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:27.2006533Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:27.2508320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:27.2508827Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:27.6882195Z ok (5.446s) 2022-08-17T13:03:27.6882395Z 2022-08-17T13:03:27.6882780Z ---------------------------------------------------------------------- 2022-08-17T13:03:27.6883139Z Ran 1 test in 5.447s 2022-08-17T13:03:27.6883311Z 2022-08-17T13:03:27.6883386Z OK 2022-08-17T13:03:27.6883522Z 2022-08-17T13:03:27.6883655Z Generating XML reports... 
2022-08-17T13:03:27.6918526Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130322.xml 2022-08-17T13:03:29.4179765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:29.4180270Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:29.4181533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:29.4182014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:29.5942245Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:03:29.5958414Z 2022-08-17T13:03:29.5958575Z Running tests... 2022-08-17T13:03:29.5959022Z ---------------------------------------------------------------------- 2022-08-17T13:03:29.5969756Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:03:31.1075762Z With reentrant autograd checkpointing impl, DDP will fail when there are ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:03:31.1263384Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34404 2022-08-17T13:03:31.1270135Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34405 2022-08-17T13:03:32.5299497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:32.5300477Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:32.5301697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:32.5302575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:32.5564452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:32.5565393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:32.5567810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:32.5568741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:32.6971475Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:03:32.7278538Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:03:33.9886938Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8v46g_0n 2022-08-17T13:03:33.9887586Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8v46g_0n/_remote_module_non_scriptable.py 2022-08-17T13:03:34.0105998Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkmnp2nrp 2022-08-17T13:03:34.0109012Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkmnp2nrp/_remote_module_non_scriptable.py 2022-08-17T13:03:34.5150678Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:34.5151266Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:34.5160030Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:34.5160590Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:34.5296455Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:03:34.5297195Z warnings.warn( 2022-08-17T13:03:34.5298248Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:03:34.5298964Z warnings.warn( 2022-08-17T13:03:34.5419568Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:34.5420081Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:34.5811582Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:34.5812080Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:03:35.0370500Z ok (5.441s) 2022-08-17T13:03:35.0370687Z 2022-08-17T13:03:35.0371083Z ---------------------------------------------------------------------- 2022-08-17T13:03:35.0371404Z Ran 1 test in 5.441s 2022-08-17T13:03:35.0371573Z 2022-08-17T13:03:35.0371669Z OK 2022-08-17T13:03:35.0371806Z 2022-08-17T13:03:35.0371940Z Generating XML reports... 2022-08-17T13:03:35.0406865Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130329.xml 2022-08-17T13:03:36.7995767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:36.7996269Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:36.7997896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:36.7998377Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:36.9764656Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:03:36.9781153Z 2022-08-17T13:03:36.9781293Z Running tests... 2022-08-17T13:03:36.9781708Z ---------------------------------------------------------------------- 2022-08-17T13:03:36.9794375Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:03:38.4977660Z Test that checkpointing with weight sharing works. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:03:38.5172739Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34524 2022-08-17T13:03:38.5179224Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34525 2022-08-17T13:03:39.9294585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:39.9295101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:39.9295875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:39.9296352Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:39.9604468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:39.9605280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:39.9607957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:39.9608444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:40.0963863Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:03:40.1333539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:03:41.3586317Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq0diwp9x 2022-08-17T13:03:41.3587142Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq0diwp9x/_remote_module_non_scriptable.py 2022-08-17T13:03:41.4126582Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp88w43lj3 2022-08-17T13:03:41.4127980Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp88w43lj3/_remote_module_non_scriptable.py 2022-08-17T13:03:41.9145997Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:41.9146607Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:41.9227806Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:41.9228370Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:41.9289334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:41.9289819Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:41.9634261Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:41.9634764Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:42.4278792Z ok (5.449s) 2022-08-17T13:03:42.4279003Z 2022-08-17T13:03:42.4279396Z ---------------------------------------------------------------------- 2022-08-17T13:03:42.4279758Z Ran 1 test in 5.450s 2022-08-17T13:03:42.4279931Z 2022-08-17T13:03:42.4280027Z OK 2022-08-17T13:03:42.4280165Z 2022-08-17T13:03:42.4280300Z Generating XML reports... 
2022-08-17T13:03:42.4315584Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130336.xml 2022-08-17T13:03:44.2014823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:44.2015330Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:44.2016403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:44.2016911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:44.3782772Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:03:44.3799816Z 2022-08-17T13:03:44.3800381Z Running tests... 2022-08-17T13:03:44.3800860Z ---------------------------------------------------------------------- 2022-08-17T13:03:44.3812650Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:03:45.9048587Z Test that checkpointing with weight sharing works. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:03:45.9245375Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34644 2022-08-17T13:03:45.9251606Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34645 2022-08-17T13:03:47.3146784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:47.3147708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:47.3148695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:47.3149297Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:47.3155000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:47.3155454Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:47.3158576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:47.3159058Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:47.4861594Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:03:47.4862094Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:03:48.7524605Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd1zny8rb 2022-08-17T13:03:48.7525725Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd1zny8rb/_remote_module_non_scriptable.py 2022-08-17T13:03:48.7733312Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp75hv2e8u 2022-08-17T13:03:48.7735615Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp75hv2e8u/_remote_module_non_scriptable.py 2022-08-17T13:03:49.2789157Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:49.2790281Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:49.2791396Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:49.2791958Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:49.2856825Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:49.2857355Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:49.3139276Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:49.3139780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:49.3332140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:49.3332661Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:49.3609011Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:49.3609544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:03:49.8351964Z ok (5.455s) 2022-08-17T13:03:49.8352195Z 2022-08-17T13:03:49.8352771Z ---------------------------------------------------------------------- 2022-08-17T13:03:49.8353120Z Ran 1 test in 5.455s 2022-08-17T13:03:49.8353290Z 2022-08-17T13:03:49.8353627Z OK 2022-08-17T13:03:49.8353782Z 2022-08-17T13:03:49.8353899Z Generating XML reports... 
2022-08-17T13:03:49.8388171Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130344.xml 2022-08-17T13:03:51.6346890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:51.6347393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:51.6349389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:51.6349884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:51.8108037Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:03:51.8124204Z 2022-08-17T13:03:51.8124362Z Running tests... 2022-08-17T13:03:51.8125075Z ---------------------------------------------------------------------- 2022-08-17T13:03:53.3172444Z test_ddp_comm_hook_allreduce_hook_nccl (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:03:53.3367544Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34764 2022-08-17T13:03:53.3373796Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34765 2022-08-17T13:03:54.7469916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:54.7470438Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:54.7471500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:54.7472006Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:54.7844845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:54.7845316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:54.7848603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:54.7849082Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:54.9118985Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:03:54.9513306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:03:56.1602606Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3_n7ggsy 2022-08-17T13:03:56.1603259Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3_n7ggsy/_remote_module_non_scriptable.py 2022-08-17T13:03:56.2135202Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_w8yzl1z 2022-08-17T13:03:56.2136424Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_w8yzl1z/_remote_module_non_scriptable.py 2022-08-17T13:03:57.2319594Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:57.2320339Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:57.2321058Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:03:57.2321610Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:03:57.7488629Z ok (5.936s) 2022-08-17T13:03:57.7489066Z 2022-08-17T13:03:57.7489816Z ---------------------------------------------------------------------- 2022-08-17T13:03:57.7490180Z Ran 1 test in 5.936s 2022-08-17T13:03:57.7490349Z 2022-08-17T13:03:57.7490447Z OK 2022-08-17T13:03:57.7490581Z 2022-08-17T13:03:57.7490957Z Generating XML reports... 2022-08-17T13:03:57.7526972Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130351.xml 2022-08-17T13:03:59.5398927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:03:59.5399422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:03:59.5401594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:03:59.5402084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:03:59.7153608Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:03:59.7170602Z 2022-08-17T13:03:59.7171180Z Running tests... 2022-08-17T13:03:59.7171714Z ---------------------------------------------------------------------- 2022-08-17T13:04:01.2211146Z test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:04:01.2402302Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34885 2022-08-17T13:04:01.2408130Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34886 2022-08-17T13:04:02.6426397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:02.6426916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:02.6427698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:02.6428155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:02.6701884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:02.6702358Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:02.6705658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:02.6706141Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:02.8079656Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:04:02.8422471Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:04:04.0636405Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0f3lq9kh 2022-08-17T13:04:04.0637039Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0f3lq9kh/_remote_module_non_scriptable.py 2022-08-17T13:04:04.1022435Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjyfexb0x 2022-08-17T13:04:04.1024883Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjyfexb0x/_remote_module_non_scriptable.py 2022-08-17T13:04:05.1932391Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:04:05.1932993Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:04:05.1933709Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:04:05.1934248Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:04:05.6519302Z ok (5.935s) 2022-08-17T13:04:05.6519497Z 2022-08-17T13:04:05.6519882Z ---------------------------------------------------------------------- 2022-08-17T13:04:05.6520257Z Ran 1 test in 5.935s 2022-08-17T13:04:05.6520425Z 2022-08-17T13:04:05.6520504Z OK 2022-08-17T13:04:05.6520638Z 2022-08-17T13:04:05.6520773Z Generating XML reports... 2022-08-17T13:04:05.6555426Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130359.xml 2022-08-17T13:04:07.4323650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:07.4324151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:07.4325342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:07.4325807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:07.6072617Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:04:07.6088506Z 2022-08-17T13:04:07.6088731Z Running tests... 2022-08-17T13:04:07.6089377Z ---------------------------------------------------------------------- 2022-08-17T13:04:09.1080814Z test_ddp_comm_hook_allreduce_hook_nccl_static_graph (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:04:09.1267440Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35006 2022-08-17T13:04:09.1273384Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35007 2022-08-17T13:04:10.5335007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:10.5335497Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:10.5336395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:10.5336874Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:10.5693459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:10.5693921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:10.5697019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:10.5697490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:10.6983543Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:04:10.7454664Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:04:11.9439113Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6go28wyl 2022-08-17T13:04:11.9439975Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6go28wyl/_remote_module_non_scriptable.py 2022-08-17T13:04:12.0157163Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzn186uvd 2022-08-17T13:04:12.0158185Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzn186uvd/_remote_module_non_scriptable.py 2022-08-17T13:04:13.0958297Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:04:13.0958903Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:04:13.0959606Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:04:13.0960142Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:04:13.5386505Z ok (5.929s) 2022-08-17T13:04:13.5386701Z 2022-08-17T13:04:13.5387284Z ---------------------------------------------------------------------- 2022-08-17T13:04:13.5387663Z Ran 1 test in 5.930s 2022-08-17T13:04:13.5387833Z 2022-08-17T13:04:13.5387948Z OK 2022-08-17T13:04:13.5388086Z 2022-08-17T13:04:13.5388220Z Generating XML reports... 2022-08-17T13:04:13.5423159Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130407.xml 2022-08-17T13:04:15.3137753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:15.3138498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:15.3139851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:15.3140351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:15.4895119Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:04:15.4911763Z 2022-08-17T13:04:15.4912145Z Running tests... 
2022-08-17T13:04:15.4912582Z ---------------------------------------------------------------------- 2022-08-17T13:04:15.4924711Z test_ddp_comm_hook_allreduce_with_then_hook_nccl (__main__.DistributedDataParallelTest) 2022-08-17T13:04:16.9965435Z This unit test verifies whether a DDP communication hook that calls allreduce and then ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:04:17.0151898Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35127 2022-08-17T13:04:17.0158214Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35128 2022-08-17T13:04:18.4637159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:18.4637665Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:18.4638621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:18.4639080Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:18.4720281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:18.4720743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:18.4723587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:18.4724045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:18.6345452Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:04:18.6360914Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:04:19.9088553Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfzo_24ji 2022-08-17T13:04:19.9089152Z 
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfzo_24ji/_remote_module_non_scriptable.py 2022-08-17T13:04:19.9531180Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplk66nhw0 2022-08-17T13:04:19.9532515Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplk66nhw0/_remote_module_non_scriptable.py 2022-08-17T13:04:20.9694042Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:04:20.9694657Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:04:20.9695344Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:04:20.9695889Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:04:21.4268175Z ok (5.935s) 2022-08-17T13:04:21.4268386Z 2022-08-17T13:04:21.4268745Z ---------------------------------------------------------------------- 2022-08-17T13:04:21.4269090Z Ran 1 test in 5.936s 2022-08-17T13:04:21.4269278Z 2022-08-17T13:04:21.4271802Z OK 2022-08-17T13:04:21.4272001Z 2022-08-17T13:04:21.4272148Z Generating XML reports... 
2022-08-17T13:04:21.4305301Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130415.xml 2022-08-17T13:04:23.2138876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:23.2139389Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:23.2139990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:23.2140663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:23.3889675Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:04:23.3905814Z 2022-08-17T13:04:23.3905938Z Running tests... 2022-08-17T13:04:23.3906609Z ---------------------------------------------------------------------- 2022-08-17T13:04:23.3914345Z test_ddp_comm_hook_future_passing_gpu_nccl (__main__.DistributedDataParallelTest) 2022-08-17T13:04:24.9085864Z This unit test verifies whether the Future object is passed properly using nccl backend. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:04:24.9280515Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35248 2022-08-17T13:04:24.9286474Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35249 2022-08-17T13:04:26.3513593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:26.3514194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:26.3515317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:26.3515871Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:26.3673599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:26.3674289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:26.3676928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:26.3677618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:26.5227838Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:04:26.5357833Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:04:27.8123233Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy931qbcg 2022-08-17T13:04:27.8124051Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy931qbcg/_remote_module_non_scriptable.py 2022-08-17T13:04:27.8185301Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvked1i_8 2022-08-17T13:04:27.8188157Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvked1i_8/_remote_module_non_scriptable.py 2022-08-17T13:04:28.9170412Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:04:28.9171028Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:04:28.9171735Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:04:28.9172260Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:04:29.4400752Z ok (6.049s) 2022-08-17T13:04:29.4400962Z 2022-08-17T13:04:29.4401360Z ---------------------------------------------------------------------- 2022-08-17T13:04:29.4401708Z Ran 1 test in 6.049s 2022-08-17T13:04:29.4401877Z 2022-08-17T13:04:29.4401970Z OK 2022-08-17T13:04:29.4402108Z 2022-08-17T13:04:29.4402246Z Generating XML reports... 2022-08-17T13:04:29.4438103Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130423.xml 2022-08-17T13:04:31.2162720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:31.2163238Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:31.2164044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:31.2164530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:31.3922582Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:04:31.3938933Z 2022-08-17T13:04:31.3939076Z Running tests... 2022-08-17T13:04:31.3939891Z ---------------------------------------------------------------------- 2022-08-17T13:04:32.9048777Z test_ddp_multi_device_module_config (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:04:32.9245179Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35369 2022-08-17T13:04:32.9251169Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35370 2022-08-17T13:04:34.3177154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:34.3177673Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:34.3178599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:34.3308350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:34.3308974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:34.3309439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:34.3311706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:34.3312182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:34.4830786Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:04:34.4972478Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:04:34.8304321Z skip: Need at least 4 CUDA devices (3.436s) 2022-08-17T13:04:34.8304585Z 2022-08-17T13:04:34.8304976Z ---------------------------------------------------------------------- 2022-08-17T13:04:34.8305317Z Ran 1 test in 3.436s 2022-08-17T13:04:34.8305486Z 2022-08-17T13:04:34.8305580Z OK (skipped=1) 2022-08-17T13:04:34.8305740Z 2022-08-17T13:04:34.8305898Z Generating XML reports... 
2022-08-17T13:04:34.8343050Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130431.xml 2022-08-17T13:04:36.5697186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:36.5697698Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:36.5698792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:36.5699277Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:36.7388017Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:04:36.7403311Z 2022-08-17T13:04:36.7403564Z Running tests... 2022-08-17T13:04:36.7403995Z ---------------------------------------------------------------------- 2022-08-17T13:04:38.2078911Z test_ddp_weight_sharing (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:04:38.2266000Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35472 2022-08-17T13:04:38.2272255Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35473 2022-08-17T13:04:39.6244297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:39.6244807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:39.6245385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:39.6245847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:39.6246412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:39.6246886Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:39.6247793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:39.6248246Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:39.7973210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:04:39.7978762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:04:41.1244000Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp33fdi3f8 2022-08-17T13:04:41.1244640Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp33fdi3f8/_remote_module_non_scriptable.py 2022-08-17T13:04:41.1321140Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpez424i8i 2022-08-17T13:04:41.1323811Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpez424i8i/_remote_module_non_scriptable.py 2022-08-17T13:04:42.1922941Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:04:42.1923561Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:04:42.1924265Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:04:42.1924810Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:04:42.2097500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:04:42.2107305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:04:42.2666643Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:04:42.2675965Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:04:42.3198663Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:04:42.3203526Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:04:42.3711442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:04:42.3718861Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:04:42.9388962Z ok (6.198s) 2022-08-17T13:04:42.9390639Z 2022-08-17T13:04:42.9391363Z ---------------------------------------------------------------------- 2022-08-17T13:04:42.9391732Z Ran 1 test in 6.198s 2022-08-17T13:04:42.9391907Z 2022-08-17T13:04:42.9392012Z OK 2022-08-17T13:04:42.9392131Z 2022-08-17T13:04:42.9392272Z Generating XML reports... 
2022-08-17T13:04:42.9424378Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130436.xml 2022-08-17T13:04:44.6894150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:44.6894651Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:44.6896019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:44.6896559Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:44.8683133Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:04:44.8699622Z 2022-08-17T13:04:44.8699880Z Running tests... 2022-08-17T13:04:44.8700325Z ---------------------------------------------------------------------- 2022-08-17T13:04:46.3678451Z test_ddp_with_lazy_parameters (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:04:46.3873036Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35593 2022-08-17T13:04:46.3879476Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35594 2022-08-17T13:04:47.8095023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:47.8095526Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:47.8096783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:47.8097257Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:47.8248521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:47.8248976Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:47.8251747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:47.8252232Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:47.9821405Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:04:47.9830307Z /opt/conda/lib/python3.10/site-packages/torch/nn/modules/lazy.py:180: UserWarning: Lazy modules are a new feature under heavy development so changes to the API or functionality can happen at any moment. 
2022-08-17T13:04:47.9830995Z warnings.warn('Lazy modules are a new feature under heavy development ' 2022-08-17T13:04:47.9927143Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsmq74v5f 2022-08-17T13:04:47.9930130Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsmq74v5f/_remote_module_non_scriptable.py 2022-08-17T13:04:47.9947958Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:04:47.9956323Z /opt/conda/lib/python3.10/site-packages/torch/nn/modules/lazy.py:180: UserWarning: Lazy modules are a new feature under heavy development so changes to the API or functionality can happen at any moment. 2022-08-17T13:04:47.9957002Z warnings.warn('Lazy modules are a new feature under heavy development ' 2022-08-17T13:04:48.0051372Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpen4nsd2r 2022-08-17T13:04:48.0054015Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpen4nsd2r/_remote_module_non_scriptable.py 2022-08-17T13:04:48.2931629Z ok (3.423s) 2022-08-17T13:04:48.2932013Z 2022-08-17T13:04:48.2932492Z ---------------------------------------------------------------------- 2022-08-17T13:04:48.2932852Z Ran 1 test in 3.423s 2022-08-17T13:04:48.2933022Z 2022-08-17T13:04:48.2933116Z OK 2022-08-17T13:04:48.2933233Z 2022-08-17T13:04:48.2933365Z Generating XML reports... 
2022-08-17T13:04:48.2968985Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130444.xml 2022-08-17T13:04:50.0689412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:50.0689930Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:50.0691506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:50.0691999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:50.2455595Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:04:50.2471608Z 2022-08-17T13:04:50.2472112Z Running tests... 2022-08-17T13:04:50.2472606Z ---------------------------------------------------------------------- 2022-08-17T13:04:51.7403543Z test_default_ddp_comm_hooks_nccl (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:04:51.7597783Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35700 2022-08-17T13:04:51.7604256Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35701 2022-08-17T13:04:53.2270069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:53.2270592Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:53.2271370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:53.2271836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:53.2309367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:53.2309829Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:53.2312304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:53.2312768Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:53.3988313Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:04:53.4002997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:04:54.6705875Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcsrr1pcf 2022-08-17T13:04:54.6706815Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcsrr1pcf/_remote_module_non_scriptable.py 2022-08-17T13:04:54.6870405Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2wgv2lpq 2022-08-17T13:04:54.6873051Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2wgv2lpq/_remote_module_non_scriptable.py 2022-08-17T13:04:55.7893923Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:04:55.7894538Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:04:55.7895255Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:04:55.7895810Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:04:56.2717966Z ok (6.024s) 2022-08-17T13:04:56.2718182Z 2022-08-17T13:04:56.2718588Z ---------------------------------------------------------------------- 2022-08-17T13:04:56.2718911Z Ran 1 test in 6.025s 2022-08-17T13:04:56.2719080Z 2022-08-17T13:04:56.2719172Z OK 2022-08-17T13:04:56.2719312Z 2022-08-17T13:04:56.2719451Z Generating XML reports... 2022-08-17T13:04:56.2756899Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130450.xml 2022-08-17T13:04:58.0344363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:04:58.0344885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:04:58.0346857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:04:58.0347605Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:04:58.2100150Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:04:58.2116376Z 2022-08-17T13:04:58.2116578Z Running tests... 2022-08-17T13:04:58.2117003Z ---------------------------------------------------------------------- 2022-08-17T13:04:59.7165937Z test_default_ddp_comm_hooks_nccl_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:04:59.7364385Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35821 2022-08-17T13:04:59.7370280Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35822 2022-08-17T13:05:01.1952798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:01.1953296Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:01.1954144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:01.1954627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:01.2436544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:01.2437009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:01.2438656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:01.2439135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:01.3619291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:05:01.4213286Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:05:02.6221613Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxo1ih4aj 2022-08-17T13:05:02.6222786Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxo1ih4aj/_remote_module_non_scriptable.py 2022-08-17T13:05:02.7047499Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgfw14s_s 2022-08-17T13:05:02.7048493Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgfw14s_s/_remote_module_non_scriptable.py 2022-08-17T13:05:03.6842779Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:05:03.6843381Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:05:03.6844059Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:05:03.6844626Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:05:04.1497585Z ok (5.938s) 2022-08-17T13:05:04.1497793Z 2022-08-17T13:05:04.1498190Z ---------------------------------------------------------------------- 2022-08-17T13:05:04.1498529Z Ran 1 test in 5.938s 2022-08-17T13:05:04.1498677Z 2022-08-17T13:05:04.1498774Z OK 2022-08-17T13:05:04.1498913Z 2022-08-17T13:05:04.1499050Z Generating XML reports... 2022-08-17T13:05:04.1534771Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130458.xml 2022-08-17T13:05:05.8776497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:05.8777002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:05.8778162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:05.8778625Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:06.0466247Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:05:06.0481627Z 2022-08-17T13:05:06.0481912Z Running tests... 2022-08-17T13:05:06.0482330Z ---------------------------------------------------------------------- 2022-08-17T13:05:07.5183131Z test_failure_recovery (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:05:07.5371275Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35942 2022-08-17T13:05:07.5377354Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35943 2022-08-17T13:05:08.9830182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:08.9831021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:08.9832047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:08.9832568Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:09.0200463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:09.0200930Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:09.0203205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:09.0203686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:09.1481445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:05:09.1932247Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:05:10.4098941Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq259kz87 2022-08-17T13:05:10.4099565Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq259kz87/_remote_module_non_scriptable.py 2022-08-17T13:05:10.4820515Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu9sbyg55 2022-08-17T13:05:10.4821702Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu9sbyg55/_remote_module_non_scriptable.py 2022-08-17T13:05:11.4769883Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:05:11.4770467Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:05:11.4771191Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:05:11.4771770Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:05:11.8937487Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:05:11.8938051Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:05:11.9796323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:05:11.9796847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:05:12.4504663Z ok (6.402s) 2022-08-17T13:05:12.4504880Z 2022-08-17T13:05:12.4505269Z ---------------------------------------------------------------------- 2022-08-17T13:05:12.4505609Z Ran 1 test in 6.402s 2022-08-17T13:05:12.4505756Z 2022-08-17T13:05:12.4505857Z OK 2022-08-17T13:05:12.4505992Z 2022-08-17T13:05:12.4506130Z Generating XML reports... 
2022-08-17T13:05:12.4563896Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130506.xml 2022-08-17T13:05:14.2587002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:14.2587801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:14.2588857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:14.2589316Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:14.4355911Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:05:14.4373097Z 2022-08-17T13:05:14.4373583Z Running tests... 2022-08-17T13:05:14.4374107Z ---------------------------------------------------------------------- 2022-08-17T13:05:15.9527423Z test_find_unused_parameters_kwarg_debug_detail (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:05:15.9694600Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82632 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.532s) 2022-08-17T13:05:15.9695159Z 2022-08-17T13:05:15.9695441Z ---------------------------------------------------------------------- 2022-08-17T13:05:15.9695771Z Ran 1 test in 1.532s 2022-08-17T13:05:15.9695933Z 2022-08-17T13:05:15.9696040Z OK (skipped=1) 2022-08-17T13:05:15.9696194Z 2022-08-17T13:05:15.9696304Z Generating XML reports... 
2022-08-17T13:05:15.9734137Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130514.xml 2022-08-17T13:05:17.7385459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:17.7385997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:17.7387559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:17.7388059Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:17.9156384Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:05:17.9179361Z 2022-08-17T13:05:17.9179760Z Running tests... 2022-08-17T13:05:17.9180242Z ---------------------------------------------------------------------- 2022-08-17T13:05:19.4119272Z test_find_unused_parameters_kwarg_debug_info (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:05:19.4280473Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/83301 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.510s) 2022-08-17T13:05:19.4281046Z 2022-08-17T13:05:19.4281322Z ---------------------------------------------------------------------- 2022-08-17T13:05:19.4281653Z Ran 1 test in 1.510s 2022-08-17T13:05:19.4281825Z 2022-08-17T13:05:19.4281935Z OK (skipped=1) 2022-08-17T13:05:19.4283632Z 2022-08-17T13:05:19.4284107Z Generating XML reports... 
2022-08-17T13:05:19.4314065Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130517.xml 2022-08-17T13:05:21.1525169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:21.1525693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:21.1526493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:21.1526963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:21.3224702Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:05:21.3240837Z 2022-08-17T13:05:21.3241110Z Running tests... 2022-08-17T13:05:21.3241836Z ---------------------------------------------------------------------- 2022-08-17T13:05:22.7838569Z test_find_unused_parameters_kwarg_debug_off (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:05:22.8000031Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82385 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.476s) 2022-08-17T13:05:22.8000593Z 2022-08-17T13:05:22.8000872Z ---------------------------------------------------------------------- 2022-08-17T13:05:22.8001186Z Ran 1 test in 1.476s 2022-08-17T13:05:22.8001668Z 2022-08-17T13:05:22.8001783Z OK (skipped=1) 2022-08-17T13:05:22.8001949Z 2022-08-17T13:05:22.8002254Z Generating XML reports... 
2022-08-17T13:05:22.8034278Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130521.xml 2022-08-17T13:05:24.5318151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:24.5318759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:24.5319679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:24.5320155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:24.7023844Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:05:24.7039292Z 2022-08-17T13:05:24.7039662Z Running tests... 2022-08-17T13:05:24.7040105Z ---------------------------------------------------------------------- 2022-08-17T13:05:26.1886236Z test_find_unused_parameters_kwarg_grad_is_view_debug_detail (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:05:26.2048590Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82979 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.501s) 2022-08-17T13:05:26.2049262Z 2022-08-17T13:05:26.2049549Z ---------------------------------------------------------------------- 2022-08-17T13:05:26.2049863Z Ran 1 test in 1.501s 2022-08-17T13:05:26.2050029Z 2022-08-17T13:05:26.2050138Z OK (skipped=1) 2022-08-17T13:05:26.2050295Z 2022-08-17T13:05:26.2050426Z Generating XML reports... 
2022-08-17T13:05:26.2082128Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130524.xml 2022-08-17T13:05:27.9519607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:27.9520135Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:27.9521520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:27.9521999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:28.1297830Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:05:28.1314876Z 2022-08-17T13:05:28.1315313Z Running tests... 2022-08-17T13:05:28.1315808Z ---------------------------------------------------------------------- 2022-08-17T13:05:29.6524076Z test_find_unused_parameters_kwarg_grad_is_view_debug_info (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:05:29.6686152Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82400 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.537s) 2022-08-17T13:05:29.6686756Z 2022-08-17T13:05:29.6687042Z ---------------------------------------------------------------------- 2022-08-17T13:05:29.6687376Z Ran 1 test in 1.537s 2022-08-17T13:05:29.6687545Z 2022-08-17T13:05:29.6687638Z OK (skipped=1) 2022-08-17T13:05:29.6687794Z 2022-08-17T13:05:29.6687921Z Generating XML reports... 
2022-08-17T13:05:29.6720178Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130528.xml 2022-08-17T13:05:31.4313614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:31.4314125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:31.4315359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:31.4315844Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:31.6113005Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:05:31.6129767Z 2022-08-17T13:05:31.6130183Z Running tests... 2022-08-17T13:05:31.6130683Z ---------------------------------------------------------------------- 2022-08-17T13:05:33.1140735Z test_find_unused_parameters_kwarg_grad_is_view_debug_off (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:05:33.1310573Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82500 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.518s) 2022-08-17T13:05:33.1311149Z 2022-08-17T13:05:33.1311428Z ---------------------------------------------------------------------- 2022-08-17T13:05:33.1311747Z Ran 1 test in 1.518s 2022-08-17T13:05:33.1311913Z 2022-08-17T13:05:33.1312032Z OK (skipped=1) 2022-08-17T13:05:33.1312188Z 2022-08-17T13:05:33.1312315Z Generating XML reports... 
2022-08-17T13:05:33.1350696Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130531.xml 2022-08-17T13:05:34.9126898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:34.9127442Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:34.9128201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:34.9128682Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:35.0895491Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:05:35.0911814Z 2022-08-17T13:05:35.0912046Z Running tests... 2022-08-17T13:05:35.0912490Z ---------------------------------------------------------------------- 2022-08-17T13:05:36.5924516Z test_fp16 (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:05:36.6124005Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36278 2022-08-17T13:05:36.6131889Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36279 2022-08-17T13:05:38.0008318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:38.0008828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:38.0009588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:38.0010085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:38.0099552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:38.0100304Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:38.0102893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:38.0103378Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:38.1666075Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:05:38.1785349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:05:39.4614702Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsev_8vj6 2022-08-17T13:05:39.4615647Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsev_8vj6/_remote_module_non_scriptable.py 2022-08-17T13:05:39.5061957Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp04e69v65 2022-08-17T13:05:39.5063127Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp04e69v65/_remote_module_non_scriptable.py 2022-08-17T13:05:40.5407198Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:05:40.5407800Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:05:40.5408511Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:05:40.5409035Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:05:41.4254365Z ok (6.334s) 2022-08-17T13:05:41.4254593Z 2022-08-17T13:05:41.4254982Z ---------------------------------------------------------------------- 2022-08-17T13:05:41.4255336Z Ran 1 test in 6.334s 2022-08-17T13:05:41.4255506Z 2022-08-17T13:05:41.4255601Z OK 2022-08-17T13:05:41.4255737Z 2022-08-17T13:05:41.4255877Z Generating XML reports... 2022-08-17T13:05:41.4291810Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130535.xml 2022-08-17T13:05:43.2321054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:43.2321587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:43.2323517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:43.2324004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:43.4101124Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:05:43.4118416Z 2022-08-17T13:05:43.4118615Z Running tests... 2022-08-17T13:05:43.4119048Z ---------------------------------------------------------------------- 2022-08-17T13:05:44.9178334Z test_fp16_compress_wrapper_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:05:44.9373845Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36399 2022-08-17T13:05:44.9380518Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36400 2022-08-17T13:05:46.3554359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:46.3554865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:46.3555830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:46.3556310Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:46.3873570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:46.3874032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:46.3877585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:46.3878091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:46.5268616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:05:46.5271310Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:05:46.5535334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:05:46.5537651Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 
1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:05:47.7989954Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplfl87yzh 2022-08-17T13:05:47.7990571Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplfl87yzh/_remote_module_non_scriptable.py 2022-08-17T13:05:47.8102912Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyzpe5d8s 2022-08-17T13:05:47.8105609Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyzpe5d8s/_remote_module_non_scriptable.py 2022-08-17T13:05:48.8655794Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:05:48.8656405Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:05:48.8657126Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:05:48.8657671Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:05:49.3494285Z ok (5.937s) 2022-08-17T13:05:49.3494548Z 2022-08-17T13:05:49.3495106Z ---------------------------------------------------------------------- 2022-08-17T13:05:49.3495459Z Ran 1 test in 5.937s 2022-08-17T13:05:49.3495627Z 2022-08-17T13:05:49.3495704Z OK 2022-08-17T13:05:49.3495841Z 2022-08-17T13:05:49.3495980Z Generating XML reports... 
2022-08-17T13:05:49.3536681Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130543.xml 2022-08-17T13:05:51.1624453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:51.1625006Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:51.1626182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:51.1626664Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:51.3406380Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:05:51.3422132Z 2022-08-17T13:05:51.3422541Z Running tests... 2022-08-17T13:05:51.3423073Z ---------------------------------------------------------------------- 2022-08-17T13:05:52.8716462Z test_fp16_compress_wrapper_nccl (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:05:52.8903160Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36520 2022-08-17T13:05:52.8909864Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36521 2022-08-17T13:05:54.2804344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:54.2804928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:54.2805568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:54.2806076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:54.3021127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:54.3021598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:54.3024503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:54.3025151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:54.4453974Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:05:54.4456033Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:05:54.4716924Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:05:54.4718460Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 
1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:05:55.7143919Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfjieem0a 2022-08-17T13:05:55.7144789Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfjieem0a/_remote_module_non_scriptable.py 2022-08-17T13:05:55.7267190Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg5h86y23 2022-08-17T13:05:55.7270085Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg5h86y23/_remote_module_non_scriptable.py 2022-08-17T13:05:56.7937203Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:05:56.7937815Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:05:56.7938529Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:05:56.7939103Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:05:57.3022978Z ok (5.960s) 2022-08-17T13:05:57.3023213Z 2022-08-17T13:05:57.3023947Z ---------------------------------------------------------------------- 2022-08-17T13:05:57.3024300Z Ran 1 test in 5.960s 2022-08-17T13:05:57.3024469Z 2022-08-17T13:05:57.3024546Z OK 2022-08-17T13:05:57.3024694Z 2022-08-17T13:05:57.3024834Z Generating XML reports... 
2022-08-17T13:05:57.3059633Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130551.xml 2022-08-17T13:05:59.0426221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:05:59.0427223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:05:59.0428413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:05:59.0429397Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:05:59.2159874Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:05:59.2175112Z 2022-08-17T13:05:59.2175280Z Running tests... 2022-08-17T13:05:59.2175724Z ---------------------------------------------------------------------- 2022-08-17T13:06:00.6794273Z test_fp16_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:06:00.6982088Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36641 2022-08-17T13:06:00.6988368Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36642 2022-08-17T13:06:02.1121115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:02.1121631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:02.1122599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:02.1123074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:02.1380937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:02.1381405Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:02.1384556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:02.1385033Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:02.2820818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:06:02.3173262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:06:03.5527591Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0cc6igbu 2022-08-17T13:06:03.5528701Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0cc6igbu/_remote_module_non_scriptable.py 2022-08-17T13:06:03.5993290Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9jtmgg8y 2022-08-17T13:06:03.5994399Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9jtmgg8y/_remote_module_non_scriptable.py 2022-08-17T13:06:04.6109807Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:06:04.6110811Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:06:04.6112036Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:06:04.6112955Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:06:05.5110899Z ok (6.293s) 2022-08-17T13:06:05.5111097Z 2022-08-17T13:06:05.5111487Z ---------------------------------------------------------------------- 2022-08-17T13:06:05.5111828Z Ran 1 test in 6.293s 2022-08-17T13:06:05.5112015Z 2022-08-17T13:06:05.5112110Z OK 2022-08-17T13:06:05.5112250Z 2022-08-17T13:06:05.5112369Z Generating XML reports... 2022-08-17T13:06:05.5147117Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130559.xml 2022-08-17T13:06:07.2617023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:07.2617572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:07.2618639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:07.2619163Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:07.4312101Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:06:07.4327393Z 2022-08-17T13:06:07.4327722Z Running tests... 2022-08-17T13:06:07.4328504Z ---------------------------------------------------------------------- 2022-08-17T13:06:08.9108588Z test_grad_layout_1devicemodule_1replicaperprocess (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:06:08.9296255Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36762 2022-08-17T13:06:08.9301911Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36763 2022-08-17T13:06:10.3107480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:10.3107989Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:10.3108741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:10.3109551Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:10.3456044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:10.3456507Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:10.3459546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:10.3460027Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:10.4765063Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:06:10.5118982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:06:11.7592407Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6z0ck3hm 2022-08-17T13:06:11.7593321Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6z0ck3hm/_remote_module_non_scriptable.py 2022-08-17T13:06:11.7656538Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjchuccks 2022-08-17T13:06:11.7659413Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjchuccks/_remote_module_non_scriptable.py 2022-08-17T13:06:13.9873204Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:06:13.9874388Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:06:13.9955709Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:06:13.9956295Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:06:14.0054448Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.0054987Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.0344577Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.0347897Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.0735697Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.0741634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.1051709Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.1052963Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.1354407Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.1357326Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.1667503Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.1671260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:06:14.1971733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.1974869Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.2277446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.2281085Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.2581505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.2585172Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.2899845Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.2900785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.3207001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.3210622Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.3525250Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.3528423Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.3825556Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.3828949Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.4126702Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.4129515Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.4424537Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:06:14.4427883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.4735944Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.4738153Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.5038170Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.5041528Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.5350848Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:14.5354075Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:15.2458501Z ok (7.813s) 2022-08-17T13:06:15.2458771Z 2022-08-17T13:06:15.2459168Z ---------------------------------------------------------------------- 2022-08-17T13:06:15.2459533Z Ran 1 test in 7.813s 2022-08-17T13:06:15.2459702Z 2022-08-17T13:06:15.2459797Z OK 2022-08-17T13:06:15.2459915Z 2022-08-17T13:06:15.2460052Z Generating XML reports... 
2022-08-17T13:06:15.2495450Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130607.xml 2022-08-17T13:06:17.0258928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:17.0259433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:17.0260203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:17.0260670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:17.2008773Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:06:17.2024627Z 2022-08-17T13:06:17.2025161Z Running tests... 2022-08-17T13:06:17.2025948Z ---------------------------------------------------------------------- 2022-08-17T13:06:18.7209034Z test_grad_layout_2devicemodule (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:06:18.7403309Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36883 2022-08-17T13:06:18.7409572Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36884 2022-08-17T13:06:20.1323705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:20.1324210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:20.1325277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:20.1326124Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:20.1616736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:20.1617188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:20.1620089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:20.1620568Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:20.2984305Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:06:20.3351113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:06:20.6465359Z skip: Need at least 4 CUDA devices (3.444s) 2022-08-17T13:06:20.6465806Z 2022-08-17T13:06:20.6466591Z ---------------------------------------------------------------------- 2022-08-17T13:06:20.6467080Z Ran 1 test in 3.444s 2022-08-17T13:06:20.6467246Z 2022-08-17T13:06:20.6467356Z OK (skipped=1) 2022-08-17T13:06:20.6467510Z 2022-08-17T13:06:20.6467629Z Generating XML reports... 
2022-08-17T13:06:20.6502120Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130617.xml 2022-08-17T13:06:22.3971016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:22.3971614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:22.3972888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:22.3973659Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:22.5740556Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:06:22.5755826Z 2022-08-17T13:06:22.5756703Z Running tests... 2022-08-17T13:06:22.5757582Z ---------------------------------------------------------------------- 2022-08-17T13:06:24.0866122Z test_invalid_powerSGD_state (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:06:24.1061715Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36986 2022-08-17T13:06:24.1068606Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36987 2022-08-17T13:06:25.5525249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:25.5525765Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:25.5526537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:25.5527018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:25.5615914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:25.5616380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:25.5619549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:25.5620052Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:25.7238225Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:06:25.7243748Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:06:25.7244868Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; 
compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:06:25.7246828Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:06:25.7247900Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:06:25.7248981Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:06:25.7250041Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:06:25.7271767Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:06:25.7277799Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 
10000; batch_tensors_with_same_shape = False 2022-08-17T13:06:25.7278880Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:06:25.7279932Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:06:25.7281185Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:06:25.7282373Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:06:25.7283514Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:06:26.0124030Z ok (3.436s) 2022-08-17T13:06:26.0124248Z 2022-08-17T13:06:26.0124645Z 
---------------------------------------------------------------------- 2022-08-17T13:06:26.0125006Z Ran 1 test in 3.437s 2022-08-17T13:06:26.0125156Z 2022-08-17T13:06:26.0125252Z OK 2022-08-17T13:06:26.0125966Z 2022-08-17T13:06:26.0126122Z Generating XML reports... 2022-08-17T13:06:26.0161194Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130622.xml 2022-08-17T13:06:27.7908189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:27.7908701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:27.7910012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:27.7910739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:27.9691350Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:06:27.9706966Z 2022-08-17T13:06:27.9707430Z Running tests... 2022-08-17T13:06:27.9708349Z ---------------------------------------------------------------------- 2022-08-17T13:06:29.4683216Z test_multiple_outputs_multiple_backward (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:06:29.4875503Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37089 2022-08-17T13:06:29.4882058Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37090 2022-08-17T13:06:30.8757365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:30.8758341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:30.8759501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:30.8760446Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:30.9104328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:30.9105256Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:30.9107132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:30.9108066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:31.0429604Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:06:31.0842314Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:06:32.2833667Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc8adoy1a 2022-08-17T13:06:32.2835382Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc8adoy1a/_remote_module_non_scriptable.py 2022-08-17T13:06:32.3604796Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_xrm7bfc 2022-08-17T13:06:32.3606408Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_xrm7bfc/_remote_module_non_scriptable.py 2022-08-17T13:06:33.4547863Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:06:33.4548905Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:06:33.4550227Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:06:33.4551275Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:06:34.4005995Z ok (6.430s) 2022-08-17T13:06:34.4006553Z 2022-08-17T13:06:34.4006947Z ---------------------------------------------------------------------- 2022-08-17T13:06:34.4007278Z Ran 1 test in 6.430s 2022-08-17T13:06:34.4007442Z 2022-08-17T13:06:34.4007536Z OK 2022-08-17T13:06:34.4007671Z 2022-08-17T13:06:34.4007815Z Generating XML reports... 2022-08-17T13:06:34.4045442Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130627.xml 2022-08-17T13:06:36.1704552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:36.1716808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:36.1717479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:36.1718218Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:36.3452844Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:06:36.3468641Z 2022-08-17T13:06:36.3468791Z Running tests... 2022-08-17T13:06:36.3469464Z ---------------------------------------------------------------------- 2022-08-17T13:06:37.8390042Z test_multiple_outputs_multiple_backward_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:06:37.8578946Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37210 2022-08-17T13:06:37.8585603Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37211 2022-08-17T13:06:39.2899399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:39.2899899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:39.2900866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:39.2901366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:39.3044989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:39.3045466Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:39.3048077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:39.3048556Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:39.4568822Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:06:39.4693769Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:06:40.7330139Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpflxmeudq 2022-08-17T13:06:40.7331243Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpflxmeudq/_remote_module_non_scriptable.py 2022-08-17T13:06:40.7392864Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq2pvjvrx 2022-08-17T13:06:40.7395867Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq2pvjvrx/_remote_module_non_scriptable.py 2022-08-17T13:06:41.7930244Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:06:41.7931429Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:06:41.7932856Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:06:41.7933956Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:06:42.7709860Z ok (6.424s) 2022-08-17T13:06:42.7710065Z 2022-08-17T13:06:42.7710464Z ---------------------------------------------------------------------- 2022-08-17T13:06:42.7711171Z Ran 1 test in 6.424s 2022-08-17T13:06:42.7711337Z 2022-08-17T13:06:42.7711412Z OK 2022-08-17T13:06:42.7711545Z 2022-08-17T13:06:42.7711680Z Generating XML reports... 2022-08-17T13:06:42.7746114Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130636.xml 2022-08-17T13:06:44.5587512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:44.5588250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:44.5589455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:44.5589937Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:44.7350064Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:06:44.7366619Z 2022-08-17T13:06:44.7366889Z Running tests... 2022-08-17T13:06:44.7367325Z ---------------------------------------------------------------------- 2022-08-17T13:06:46.2438036Z test_nccl_backend_1gpu_module_device_ids_integer_list (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:06:46.2624599Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37331 2022-08-17T13:06:46.2631414Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37332 2022-08-17T13:06:47.6654349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:47.6655344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:47.6656562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:47.6657476Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:47.6893499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:47.6894380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:47.6897562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:47.6898527Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:47.8293590Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:06:47.8621741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:06:49.0913388Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbjqu2b84 2022-08-17T13:06:49.0914196Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbjqu2b84/_remote_module_non_scriptable.py 2022-08-17T13:06:49.1064987Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps9rmj_b6 2022-08-17T13:06:49.1068101Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps9rmj_b6/_remote_module_non_scriptable.py 2022-08-17T13:06:50.5734213Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:06:50.5734824Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:06:50.5798652Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:06:50.5799216Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:06:50.5852893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:50.5853404Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:51.0752617Z ok (6.338s) 2022-08-17T13:06:51.0752774Z 2022-08-17T13:06:51.0753391Z ---------------------------------------------------------------------- 2022-08-17T13:06:51.0753753Z Ran 1 test in 6.338s 2022-08-17T13:06:51.0753934Z 2022-08-17T13:06:51.0754031Z OK 2022-08-17T13:06:51.0754168Z 2022-08-17T13:06:51.0754280Z Generating XML reports... 
2022-08-17T13:06:51.0790608Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130644.xml 2022-08-17T13:06:52.8626643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:52.8627123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:52.8628513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:52.8628994Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:53.0383596Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:06:53.0400121Z 2022-08-17T13:06:53.0400346Z Running tests... 2022-08-17T13:06:53.0400799Z ---------------------------------------------------------------------- 2022-08-17T13:06:54.5547441Z test_nccl_backend_1gpu_module_device_ids_torch_device_list (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:06:54.5744510Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37452 2022-08-17T13:06:54.5750972Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37453 2022-08-17T13:06:55.9663764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:55.9664473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:55.9665083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:55.9665589Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:55.9874810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:06:55.9875273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:06:55.9878698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:06:55.9879178Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:06:56.1323043Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:06:56.1595497Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:06:57.3935255Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn9vbdaa3 2022-08-17T13:06:57.3935877Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn9vbdaa3/_remote_module_non_scriptable.py 2022-08-17T13:06:57.4494043Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp814pjgzi 2022-08-17T13:06:57.4496089Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp814pjgzi/_remote_module_non_scriptable.py 2022-08-17T13:06:58.9692001Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:06:58.9692608Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:06:58.9721632Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:06:58.9722205Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:06:58.9760074Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:58.9779963Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:06:59.4875673Z ok (6.447s) 2022-08-17T13:06:59.4875849Z 2022-08-17T13:06:59.4876282Z ---------------------------------------------------------------------- 2022-08-17T13:06:59.4876623Z Ran 1 test in 6.448s 2022-08-17T13:06:59.4876791Z 2022-08-17T13:06:59.4876884Z OK 2022-08-17T13:06:59.4877019Z 2022-08-17T13:06:59.4877137Z Generating XML reports... 
2022-08-17T13:06:59.4912914Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130653.xml 2022-08-17T13:07:01.2291159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:01.2291657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:01.2292752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:01.2293248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:01.3983232Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:07:01.3998715Z 2022-08-17T13:07:01.3998840Z Running tests... 2022-08-17T13:07:01.3999719Z ---------------------------------------------------------------------- 2022-08-17T13:07:02.8716553Z test_nccl_backend_2gpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:07:02.8903226Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37573 2022-08-17T13:07:02.8909245Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37574 2022-08-17T13:07:04.3462419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:04.3463122Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:04.3464238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:04.3464712Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:04.3795745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:04.3796212Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:04.3799525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:04.3800002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:04.5163144Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:07:04.5520142Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:07:04.8962907Z skip: Need at least 4 CUDA devices (3.496s) 2022-08-17T13:07:04.8963142Z 2022-08-17T13:07:04.8963538Z ---------------------------------------------------------------------- 2022-08-17T13:07:04.8963885Z Ran 1 test in 3.496s 2022-08-17T13:07:04.8964342Z 2022-08-17T13:07:04.8964458Z OK (skipped=1) 2022-08-17T13:07:04.8964621Z 2022-08-17T13:07:04.8964757Z Generating XML reports... 
2022-08-17T13:07:04.8999761Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130701.xml 2022-08-17T13:07:06.6853251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:06.6854082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:06.6854997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:06.6855481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:06.8622722Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:07:06.8638881Z 2022-08-17T13:07:06.8639081Z Running tests... 2022-08-17T13:07:06.8639548Z ---------------------------------------------------------------------- 2022-08-17T13:07:08.3874506Z test_nccl_backend_4gpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:07:08.4068788Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37676 2022-08-17T13:07:08.4074826Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37677 2022-08-17T13:07:09.8263672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:09.8264239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:09.8264852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:09.8265357Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:09.8320724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:09.8321174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:09.8324757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:09.8325249Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:09.9958878Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:07:10.0030887Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:07:10.3130700Z skip: Need at least 8 CUDA devices (3.449s) 2022-08-17T13:07:10.3130948Z 2022-08-17T13:07:10.3131324Z ---------------------------------------------------------------------- 2022-08-17T13:07:10.3131666Z Ran 1 test in 3.449s 2022-08-17T13:07:10.3131837Z 2022-08-17T13:07:10.3131948Z OK (skipped=1) 2022-08-17T13:07:10.3132108Z 2022-08-17T13:07:10.3132250Z Generating XML reports... 
2022-08-17T13:07:10.3167880Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130706.xml 2022-08-17T13:07:12.1034313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:12.1035131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:12.1036048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:12.1036510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:12.2808207Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:07:12.2824650Z 2022-08-17T13:07:12.2824883Z Running tests... 2022-08-17T13:07:12.2825315Z ---------------------------------------------------------------------- 2022-08-17T13:07:13.7938557Z test_nccl_backend_multi_device_ids_not_allowed (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:07:13.8134432Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37779 2022-08-17T13:07:13.8140720Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37780 2022-08-17T13:07:15.1729637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:15.1730138Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:15.1731145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:15.1731632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:15.1931151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:15.1931614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:15.1934329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:15.1934822Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:15.3390317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:07:15.3671469Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:07:17.0226485Z ok (4.740s) 2022-08-17T13:07:17.0226820Z 2022-08-17T13:07:17.0227272Z ---------------------------------------------------------------------- 2022-08-17T13:07:17.0227635Z Ran 1 test in 4.740s 2022-08-17T13:07:17.0227803Z 2022-08-17T13:07:17.0227900Z OK 2022-08-17T13:07:17.0228059Z 2022-08-17T13:07:17.0228196Z Generating XML reports... 
2022-08-17T13:07:17.0264611Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130712.xml 2022-08-17T13:07:18.8091266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:18.8091789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:18.8095948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:18.8096425Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:18.9860372Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:07:18.9876403Z 2022-08-17T13:07:18.9876845Z Running tests... 2022-08-17T13:07:18.9877331Z ---------------------------------------------------------------------- 2022-08-17T13:07:20.4879316Z test_nccl_backend_multi_device_module_device_ids_None (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:07:20.5072797Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37888 2022-08-17T13:07:20.5078874Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37889 2022-08-17T13:07:21.9436550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:21.9437074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:21.9438010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:21.9438502Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:21.9862613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:21.9863104Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:21.9865746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:21.9866517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:22.1115827Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:07:22.1619566Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:07:22.5136044Z skip: Need at least 4 CUDA devices (3.526s) 2022-08-17T13:07:22.5136490Z 2022-08-17T13:07:22.5137259Z ---------------------------------------------------------------------- 2022-08-17T13:07:22.5137757Z Ran 1 test in 3.526s 2022-08-17T13:07:22.5137927Z 2022-08-17T13:07:22.5138043Z OK (skipped=1) 2022-08-17T13:07:22.5138182Z 2022-08-17T13:07:22.5138314Z Generating XML reports... 
2022-08-17T13:07:22.5173712Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130718.xml 2022-08-17T13:07:24.2854835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:24.2855377Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:24.2855980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:24.2856459Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:24.4609859Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:07:24.4625955Z 2022-08-17T13:07:24.4626183Z Running tests... 2022-08-17T13:07:24.4626613Z ---------------------------------------------------------------------- 2022-08-17T13:07:25.9600397Z test_nccl_backend_single_device_module_device_ids_None (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:07:25.9793901Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37991 2022-08-17T13:07:25.9800171Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37992 2022-08-17T13:07:27.3260082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:27.3260705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:27.3261477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:27.3261966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:27.3648334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:27.3648787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:27.3651329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:27.3651812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:27.4918537Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:07:27.5375803Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:07:28.7540385Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp475x7e57 2022-08-17T13:07:28.7540978Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp475x7e57/_remote_module_non_scriptable.py 2022-08-17T13:07:28.8320748Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnyf8wlle 2022-08-17T13:07:28.8321591Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnyf8wlle/_remote_module_non_scriptable.py 2022-08-17T13:07:30.2481635Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:07:30.2482200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:07:30.7922313Z ok (6.329s) 2022-08-17T13:07:30.7922519Z 2022-08-17T13:07:30.7923223Z ---------------------------------------------------------------------- 2022-08-17T13:07:30.7923581Z Ran 1 test in 6.330s 2022-08-17T13:07:30.7923768Z 2022-08-17T13:07:30.7923861Z OK 2022-08-17T13:07:30.7924005Z 2022-08-17T13:07:30.7924142Z Generating XML reports... 2022-08-17T13:07:30.7959454Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130724.xml 2022-08-17T13:07:32.5699754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:32.5700260Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:32.5701291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:32.5702046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:32.7454260Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:07:32.7469753Z 2022-08-17T13:07:32.7470245Z Running tests... 2022-08-17T13:07:32.7470735Z ---------------------------------------------------------------------- 2022-08-17T13:07:34.2431595Z test_nccl_backend_single_device_module_empty_device_ids (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:07:34.2620131Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38112 2022-08-17T13:07:34.2627072Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38113 2022-08-17T13:07:35.6889584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:35.6890093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:35.6890711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:35.6891183Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:35.6999740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:35.7000202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:35.7002896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:35.7003355Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:35.8575314Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:07:35.8742680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:07:37.1378528Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwqa6hy7b 2022-08-17T13:07:37.1379198Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwqa6hy7b/_remote_module_non_scriptable.py 2022-08-17T13:07:37.1596553Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7uh6k3il 2022-08-17T13:07:37.1599303Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7uh6k3il/_remote_module_non_scriptable.py 2022-08-17T13:07:38.6623993Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:07:38.6641794Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:07:39.1753504Z ok (6.428s) 2022-08-17T13:07:39.1753821Z 2022-08-17T13:07:39.1754434Z ---------------------------------------------------------------------- 2022-08-17T13:07:39.1755024Z Ran 1 test in 6.428s 2022-08-17T13:07:39.1755306Z 2022-08-17T13:07:39.1755483Z OK 2022-08-17T13:07:39.1755730Z 2022-08-17T13:07:39.1755971Z Generating XML reports... 2022-08-17T13:07:39.1791748Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130732.xml 2022-08-17T13:07:40.9392498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:40.9393047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:40.9394065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:40.9394579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:41.1092902Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:07:41.1108238Z 2022-08-17T13:07:41.1108407Z Running tests... 2022-08-17T13:07:41.1108842Z ---------------------------------------------------------------------- 2022-08-17T13:07:42.5709222Z test_nccl_propagate_error_reason (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:07:42.5897099Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38233 2022-08-17T13:07:42.5903427Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38234 2022-08-17T13:07:44.0098932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:44.0099415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:44.0100843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:44.0101320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:44.0295007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:07:44.0295495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:07:44.0298126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:07:44.0298598Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:07:44.1750540Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:07:44.2010997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:08:03.0368517Z ok (21.926s) 2022-08-17T13:08:03.0368840Z 2022-08-17T13:08:03.0370848Z ---------------------------------------------------------------------- 2022-08-17T13:08:03.0371406Z Ran 1 test in 21.926s 2022-08-17T13:08:03.0371581Z 2022-08-17T13:08:03.0371688Z OK 2022-08-17T13:08:03.0371827Z 2022-08-17T13:08:03.0371966Z Generating XML reports... 
2022-08-17T13:08:03.0405239Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130741.xml 2022-08-17T13:08:04.7658238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:04.7659255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:04.7660468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:04.7661357Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:04.9418296Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:08:04.9435664Z 2022-08-17T13:08:04.9436046Z Running tests... 2022-08-17T13:08:04.9436898Z ---------------------------------------------------------------------- 2022-08-17T13:08:04.9454740Z test_no_grad (__main__.DistributedDataParallelTest) 2022-08-17T13:08:06.4434445Z Note: this test can be sped up by only running it on a CPU module ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:08:06.4620999Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38354 2022-08-17T13:08:06.4628207Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38355 2022-08-17T13:08:07.8531600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:07.8532144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:07.8533186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:07.8533690Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:07.8945922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:07.8946393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:07.8949188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:07.8949913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:08.0182773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:08:08.0627040Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:08:09.2681801Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbzs7gf6z 2022-08-17T13:08:09.2682557Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbzs7gf6z/_remote_module_non_scriptable.py 2022-08-17T13:08:09.3114634Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt7ccbekg 2022-08-17T13:08:09.3116600Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt7ccbekg/_remote_module_non_scriptable.py 2022-08-17T13:08:10.3694417Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:08:10.3695058Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:08:10.3695753Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:08:10.3696304Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:08:11.2752197Z ok (6.331s) 2022-08-17T13:08:11.2752404Z 2022-08-17T13:08:11.2752799Z ---------------------------------------------------------------------- 2022-08-17T13:08:11.2753143Z Ran 1 test in 6.332s 2022-08-17T13:08:11.2753312Z 2022-08-17T13:08:11.2755845Z OK 2022-08-17T13:08:11.2756102Z 2022-08-17T13:08:11.2756254Z Generating XML reports... 2022-08-17T13:08:11.2857306Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130804.xml 2022-08-17T13:08:13.0654884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:13.0655392Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:13.0656153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:13.0656641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:13.2409220Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:08:13.2425158Z 2022-08-17T13:08:13.2425433Z Running tests... 2022-08-17T13:08:13.2425869Z ---------------------------------------------------------------------- 2022-08-17T13:08:14.7598555Z test_param_layout_mismatch_error (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:08:14.7791298Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38471 2022-08-17T13:08:14.7797892Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38472 2022-08-17T13:08:16.1912330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:16.1913304Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:16.1914511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:16.1915428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:16.2055673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:16.2056608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:16.2058571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:16.2059818Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:16.3572239Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:08:16.3776648Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:08:17.6162331Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdd7bzssf 2022-08-17T13:08:17.6163483Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdd7bzssf/_remote_module_non_scriptable.py 2022-08-17T13:08:17.6919782Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl22qwqig 2022-08-17T13:08:17.6921016Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl22qwqig/_remote_module_non_scriptable.py 2022-08-17T13:08:19.1907020Z ok 
(5.948s) 2022-08-17T13:08:19.1907251Z 2022-08-17T13:08:19.1907667Z ---------------------------------------------------------------------- 2022-08-17T13:08:19.1908032Z Ran 1 test in 5.948s 2022-08-17T13:08:19.1908202Z 2022-08-17T13:08:19.1908300Z OK 2022-08-17T13:08:19.1908418Z 2022-08-17T13:08:19.1908557Z Generating XML reports... 2022-08-17T13:08:19.1943861Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130813.xml 2022-08-17T13:08:20.9664011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:20.9664958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:20.9666302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:20.9666788Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:21.1435150Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:08:21.1451824Z 2022-08-17T13:08:21.1452070Z Running tests... 2022-08-17T13:08:21.1452504Z ---------------------------------------------------------------------- 2022-08-17T13:08:22.6487596Z test_pass_default_pg (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:08:22.6684632Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38588 2022-08-17T13:08:22.6690952Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38589 2022-08-17T13:08:24.1182849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:24.1183829Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:24.1184448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:24.1184928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:24.1329862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:24.1330349Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:24.1334211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:24.1334711Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:24.2828076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:08:24.2831970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:08:24.3044840Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:08:24.3050500Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:08:24.3051280Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:08:24.3138869Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:08:24.5744730Z ok (3.429s) 2022-08-17T13:08:24.5744955Z 2022-08-17T13:08:24.5745346Z ---------------------------------------------------------------------- 2022-08-17T13:08:24.5745688Z Ran 1 test in 3.429s 2022-08-17T13:08:24.5745836Z 2022-08-17T13:08:24.5745934Z OK 2022-08-17T13:08:24.5746069Z 2022-08-17T13:08:24.5746204Z Generating XML reports... 2022-08-17T13:08:24.5780961Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130821.xml 2022-08-17T13:08:26.3546961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:26.3547474Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:26.3548850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:26.3549349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:26.5306064Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:08:26.5322223Z 2022-08-17T13:08:26.5322645Z Running tests... 2022-08-17T13:08:26.5323143Z ---------------------------------------------------------------------- 2022-08-17T13:08:28.0323334Z test_powerSGD_ddp_comm_hook_nccl (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:08:28.0519335Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38695 2022-08-17T13:08:28.0525492Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38696 2022-08-17T13:08:29.4571346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:29.4571872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:29.4572680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:29.4573149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:29.4796331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:29.4796796Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:29.4799361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:29.4799820Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:29.6236478Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:08:29.6239076Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:29.6537191Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:08:29.6539082Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 
1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:30.8835219Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprw3ro5da 2022-08-17T13:08:30.8835813Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprw3ro5da/_remote_module_non_scriptable.py 2022-08-17T13:08:30.9474294Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb3e6mnpc 2022-08-17T13:08:30.9475238Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb3e6mnpc/_remote_module_non_scriptable.py 2022-08-17T13:08:31.9448144Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:08:31.9448731Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:08:31.9449444Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:08:31.9449989Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:08:31.9493069Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:31.9494179Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; 
compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:31.9542272Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:31.9543532Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:31.9591698Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:31.9592770Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:31.9640994Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:31.9642102Z 
INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:31.9689657Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:31.9690878Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:31.9738401Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:31.9739463Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:31.9787431Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; 
min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:31.9788494Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:32.4637499Z ok (5.931s) 2022-08-17T13:08:32.4637699Z 2022-08-17T13:08:32.4638103Z ---------------------------------------------------------------------- 2022-08-17T13:08:32.4638460Z Ran 1 test in 5.931s 2022-08-17T13:08:32.4638624Z 2022-08-17T13:08:32.4638701Z OK 2022-08-17T13:08:32.4638834Z 2022-08-17T13:08:32.4638969Z Generating XML reports... 2022-08-17T13:08:32.4674216Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130826.xml 2022-08-17T13:08:34.2507379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:34.2507891Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:34.2508903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:34.2509379Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:34.4290579Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:08:34.4306690Z 2022-08-17T13:08:34.4306907Z Running tests... 2022-08-17T13:08:34.4307333Z ---------------------------------------------------------------------- 2022-08-17T13:08:35.9343127Z test_powerSGD_ddp_comm_hook_nccl_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:08:35.9532021Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38816 2022-08-17T13:08:35.9538512Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38817 2022-08-17T13:08:37.3724055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:37.3724612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:37.3725735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:37.3726224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:37.3819300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:37.3819777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:37.3822534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:37.3823012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:37.5384684Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:08:37.5387435Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:37.5541233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:08:37.5542974Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 
1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:38.8142550Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcxvjwq4f 2022-08-17T13:08:38.8143151Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcxvjwq4f/_remote_module_non_scriptable.py 2022-08-17T13:08:38.8517715Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzdkfows7 2022-08-17T13:08:38.8518693Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzdkfows7/_remote_module_non_scriptable.py 2022-08-17T13:08:39.9315730Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:08:39.9316333Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:08:39.9317045Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:08:39.9317587Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:08:39.9434221Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:39.9435326Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; 
compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:39.9482111Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:39.9483302Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:39.9530374Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:39.9531463Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:39.9578691Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:39.9579777Z 
INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:39.9626883Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:39.9627955Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:39.9675529Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:39.9676663Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = True 2022-08-17T13:08:39.9723086Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; 
min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:39.9724309Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1000; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:08:40.4654774Z ok (6.034s) 2022-08-17T13:08:40.4654993Z 2022-08-17T13:08:40.4655384Z ---------------------------------------------------------------------- 2022-08-17T13:08:40.4655709Z Ran 1 test in 6.035s 2022-08-17T13:08:40.4655885Z 2022-08-17T13:08:40.4655981Z OK 2022-08-17T13:08:40.4656125Z 2022-08-17T13:08:40.4656260Z Generating XML reports... 2022-08-17T13:08:40.4691551Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130834.xml 2022-08-17T13:08:42.2367021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:42.2367529Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:42.2368773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:42.2369234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:42.4115945Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:08:42.4132021Z 2022-08-17T13:08:42.4132553Z Running tests... 2022-08-17T13:08:42.4133232Z ---------------------------------------------------------------------- 2022-08-17T13:08:43.9152156Z test_sync_batch_norm_empty_input (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:08:43.9345654Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38937 2022-08-17T13:08:43.9351841Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38938 2022-08-17T13:08:45.3445221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:45.3445743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:45.3446528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:45.3446993Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:45.3677972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:45.3678436Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:45.3681252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:45.3681726Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:45.5117126Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:08:45.5437071Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:08:46.8271198Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprzmlw_xz 2022-08-17T13:08:46.8271813Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprzmlw_xz/_remote_module_non_scriptable.py 2022-08-17T13:08:46.8326796Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu270kekb 2022-08-17T13:08:46.8329640Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu270kekb/_remote_module_non_scriptable.py 2022-08-17T13:08:47.8846708Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:08:47.8847336Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:08:47.8848309Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:08:47.8848883Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:08:49.0961805Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:08:49.0962352Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:08:49.7495348Z ok (7.336s) 2022-08-17T13:08:49.7495571Z 2022-08-17T13:08:49.7495962Z ---------------------------------------------------------------------- 2022-08-17T13:08:49.7496303Z Ran 1 test in 7.336s 2022-08-17T13:08:49.7496471Z 2022-08-17T13:08:49.7496822Z OK 2022-08-17T13:08:49.7496973Z 2022-08-17T13:08:49.7497115Z Generating XML reports... 
2022-08-17T13:08:49.7531911Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130842.xml 2022-08-17T13:08:51.5427608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:51.5428101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:51.5429445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:51.5429931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:51.7207669Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:08:51.7223171Z 2022-08-17T13:08:51.7223604Z Running tests... 2022-08-17T13:08:51.7224087Z ---------------------------------------------------------------------- 2022-08-17T13:08:53.2298617Z test_sync_batch_norm_only_empty_input (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:08:53.2494667Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39058 2022-08-17T13:08:53.2500866Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39059 2022-08-17T13:08:54.6773906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:54.6774461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:54.6775226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:54.6775707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:54.6990781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:08:54.6991273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:08:54.6994051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:08:54.6994531Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:08:54.8498727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:08:54.8672457Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:08:56.1199413Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp83f7sexw 2022-08-17T13:08:56.1200023Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp83f7sexw/_remote_module_non_scriptable.py 2022-08-17T13:08:56.1591507Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp85gmjrkr 2022-08-17T13:08:56.1593623Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp85gmjrkr/_remote_module_non_scriptable.py 2022-08-17T13:08:57.2495814Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:08:57.2496706Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:08:57.2497441Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:08:57.2497993Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:08:57.8824481Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:08:57.8825013Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:08:58.4631379Z ok (6.740s) 2022-08-17T13:08:58.4631758Z 2022-08-17T13:08:58.4632167Z ---------------------------------------------------------------------- 2022-08-17T13:08:58.4632861Z Ran 1 test in 6.741s 2022-08-17T13:08:58.4633038Z 2022-08-17T13:08:58.4633133Z OK 2022-08-17T13:08:58.4633276Z 2022-08-17T13:08:58.4633417Z Generating XML reports... 
2022-08-17T13:08:58.4668151Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130851.xml 2022-08-17T13:09:00.2284133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:00.2285122Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:00.2286281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:00.2287195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:00.4056260Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:00.4074106Z 2022-08-17T13:09:00.4074568Z Running tests... 2022-08-17T13:09:00.4075073Z ---------------------------------------------------------------------- 2022-08-17T13:09:01.9084610Z test_invalid_nccl_blocking_wait_env (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:09:01.9273319Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39179 2022-08-17T13:09:01.9279587Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39180 2022-08-17T13:09:01.9285655Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 39181 2022-08-17T13:09:03.3283188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:03.3283701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:03.3284483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:03.3284982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:03.3600026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:03.3600535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:03.3601881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:03.3602352Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:03.3835634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:03.3836102Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:03.3838653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:03.3839306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:03.4976640Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:09:03.5360858Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:09:03.5508655Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:09:03.8341786Z skip: Need at least 3 CUDA devices (3.426s) 2022-08-17T13:09:03.8342041Z 2022-08-17T13:09:03.8342450Z ---------------------------------------------------------------------- 2022-08-17T13:09:03.8342796Z Ran 1 test in 3.427s 2022-08-17T13:09:03.8342962Z 2022-08-17T13:09:03.8343075Z OK (skipped=1) 2022-08-17T13:09:03.8343238Z 2022-08-17T13:09:03.8344551Z Generating XML reports... 2022-08-17T13:09:03.8379533Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130900.xml 2022-08-17T13:09:05.6242824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:05.6243747Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:05.6244559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:05.6245048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:05.8007870Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:05.8024002Z 2022-08-17T13:09:05.8024145Z Running tests... 2022-08-17T13:09:05.8025350Z ---------------------------------------------------------------------- 2022-08-17T13:09:07.2964810Z test_nccl_blocking_wait_with_barrier (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:09:07.3160957Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39316 2022-08-17T13:09:07.3167137Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39317 2022-08-17T13:09:07.3173622Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 39318 2022-08-17T13:09:08.7363705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:08.7364215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:08.7365270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:08.7365748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:08.7697103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:08.7697553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:08.7698117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:08.7698578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:08.7700478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:08.7700937Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:08.7701533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:08.7701996Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:08.9059993Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:09:08.9415013Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:09:08.9473073Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:09:09.3233278Z skip: Need at least 3 CUDA devices (3.521s) 2022-08-17T13:09:09.3233549Z 2022-08-17T13:09:09.3233942Z ---------------------------------------------------------------------- 2022-08-17T13:09:09.3234267Z Ran 1 test in 3.521s 2022-08-17T13:09:09.3234435Z 2022-08-17T13:09:09.3234805Z OK (skipped=1) 2022-08-17T13:09:09.3234983Z 2022-08-17T13:09:09.3235124Z Generating XML reports... 2022-08-17T13:09:09.3270911Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130905.xml 2022-08-17T13:09:11.1190705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:11.1191189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:11.1192185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:11.1192660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:11.2950927Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:11.2966869Z 2022-08-17T13:09:11.2967024Z Running tests... 2022-08-17T13:09:11.2967475Z ---------------------------------------------------------------------- 2022-08-17T13:09:11.2973424Z test_nccl_errors_blocking_abort (__main__.NcclErrorHandlingTest) ... 
skip: Frequently times out see https://github.com/pytorch/pytorch/issues/58920 (0.000s) 2022-08-17T13:09:11.2973791Z 2022-08-17T13:09:11.2974071Z ---------------------------------------------------------------------- 2022-08-17T13:09:11.2974387Z Ran 1 test in 0.001s 2022-08-17T13:09:11.2974555Z 2022-08-17T13:09:11.2974668Z OK (skipped=1) 2022-08-17T13:09:11.2974823Z 2022-08-17T13:09:11.2974959Z Generating XML reports... 2022-08-17T13:09:11.3007076Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130911.xml 2022-08-17T13:09:12.9110052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:12.9110603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:12.9111905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:12.9112634Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:13.0811385Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:13.0827138Z 2022-08-17T13:09:13.0827515Z Running tests... 2022-08-17T13:09:13.0828397Z ---------------------------------------------------------------------- 2022-08-17T13:09:14.5631028Z test_nccl_errors_blocking_clean_exit (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:09:14.5817181Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39486 2022-08-17T13:09:14.5823481Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39487 2022-08-17T13:09:14.5830158Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 39488 2022-08-17T13:09:15.9870884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:15.9871617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:15.9873180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:15.9873665Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:16.0015225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:16.0015689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:16.0018629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:16.0019120Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:16.0093543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:16.0094290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:16.0096982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:16.0097459Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:16.1561031Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:09:16.1700411Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:09:16.1798284Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:09:16.4886383Z skip: Need at least 3 CUDA devices (3.406s) 2022-08-17T13:09:16.4886939Z 2022-08-17T13:09:16.4887314Z ---------------------------------------------------------------------- 2022-08-17T13:09:16.4887667Z Ran 1 test in 3.406s 2022-08-17T13:09:16.4887829Z 2022-08-17T13:09:16.4887920Z OK (skipped=1) 2022-08-17T13:09:16.4888087Z 2022-08-17T13:09:16.4888213Z Generating XML reports... 2022-08-17T13:09:16.4923783Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130913.xml 2022-08-17T13:09:18.2652351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:18.2652871Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:18.2654017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:18.2654496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:18.4427142Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:18.4443476Z 2022-08-17T13:09:18.4443618Z Running tests... 2022-08-17T13:09:18.4444374Z ---------------------------------------------------------------------- 2022-08-17T13:09:19.9407344Z test_nccl_errors_blocking_nonzero_exit (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:09:19.9599378Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39623 2022-08-17T13:09:19.9605270Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39624 2022-08-17T13:09:19.9611957Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 39625 2022-08-17T13:09:21.3842237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:21.3843282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:21.3844455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:21.3845390Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:21.3846583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:21.3847461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:21.3849079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:21.3850032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:21.4015053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:21.4015918Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:21.4019385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:21.4020376Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:21.5582717Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:09:21.5584289Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:09:21.5783097Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:09:21.9675500Z ok (3.523s) 2022-08-17T13:09:21.9675695Z 2022-08-17T13:09:21.9676108Z ---------------------------------------------------------------------- 2022-08-17T13:09:21.9676446Z Ran 1 test in 3.523s 2022-08-17T13:09:21.9676611Z 2022-08-17T13:09:21.9676707Z OK 2022-08-17T13:09:21.9676841Z 2022-08-17T13:09:21.9676961Z Generating XML reports... 2022-08-17T13:09:21.9711704Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130918.xml 2022-08-17T13:09:23.7492534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:23.7493193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:23.7493923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:23.7494612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:23.9254519Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:23.9270957Z 2022-08-17T13:09:23.9271273Z Running tests... 2022-08-17T13:09:23.9271694Z ---------------------------------------------------------------------- 2022-08-17T13:09:25.4371577Z test_nccl_errors_blocking_sigkill (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:09:25.4567271Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39760 2022-08-17T13:09:25.4573145Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39761 2022-08-17T13:09:25.4579525Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 39762 2022-08-17T13:09:26.8613441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:26.8614098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:26.8614813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:26.8615290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:26.8971486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:26.8971972Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:26.8974926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:26.8975425Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:26.9282817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:26.9283289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:26.9286123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:26.9286609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:27.0284079Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:09:27.0621328Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:09:27.1033585Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:09:27.4642365Z ok (3.537s) 2022-08-17T13:09:27.4642610Z 2022-08-17T13:09:27.4643248Z ---------------------------------------------------------------------- 2022-08-17T13:09:27.4643611Z Ran 1 test in 3.537s 2022-08-17T13:09:27.4643777Z 2022-08-17T13:09:27.4644144Z OK 2022-08-17T13:09:27.4644303Z 2022-08-17T13:09:27.4644429Z Generating XML reports... 2022-08-17T13:09:27.4682124Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130923.xml 2022-08-17T13:09:29.2519015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:29.2520003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:29.2521213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:29.2522117Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:29.4291359Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:29.4309266Z 2022-08-17T13:09:29.4309709Z Running tests... 2022-08-17T13:09:29.4310210Z ---------------------------------------------------------------------- 2022-08-17T13:09:30.9310552Z test_nccl_errors_blocking_sigterm (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:09:30.9507307Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39897 2022-08-17T13:09:30.9514235Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39898 2022-08-17T13:09:30.9520664Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 39899 2022-08-17T13:09:32.3511411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:32.3512361Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:32.3513553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:32.3514501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:32.3695247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:32.3696155Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:32.3697809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:32.3698771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:32.3881460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:32.3882382Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:32.3883976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:32.3884936Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:32.5177114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:09:32.5381054Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:09:32.5541928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:09:32.8580002Z ok (3.427s) 2022-08-17T13:09:32.8580213Z 2022-08-17T13:09:32.8580614Z ---------------------------------------------------------------------- 2022-08-17T13:09:32.8580956Z Ran 1 test in 3.427s 2022-08-17T13:09:32.8581122Z 2022-08-17T13:09:32.8581217Z OK 2022-08-17T13:09:32.8581353Z 2022-08-17T13:09:32.8582573Z Generating XML reports... 2022-08-17T13:09:32.8617618Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130929.xml 2022-08-17T13:09:34.6487658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:34.6488273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:34.6489526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:34.6490027Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:34.8258234Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:34.8275203Z 2022-08-17T13:09:34.8275663Z Running tests... 2022-08-17T13:09:34.8276181Z ---------------------------------------------------------------------- 2022-08-17T13:09:34.8289467Z test_nccl_errors_nonblocking (__main__.NcclErrorHandlingTest) ... skip: Test does not pass when run locally (0.001s) 2022-08-17T13:09:34.8289784Z 2022-08-17T13:09:34.8290069Z ---------------------------------------------------------------------- 2022-08-17T13:09:34.8290669Z Ran 1 test in 0.001s 2022-08-17T13:09:34.8290834Z 2022-08-17T13:09:34.8290943Z OK (skipped=1) 2022-08-17T13:09:34.8291098Z 2022-08-17T13:09:34.8291225Z Generating XML reports... 
2022-08-17T13:09:34.8323398Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130934.xml 2022-08-17T13:09:36.4351152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:36.4351635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:36.4352668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:36.4353151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:36.6113006Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:36.6129011Z 2022-08-17T13:09:36.6129515Z Running tests... 2022-08-17T13:09:36.6130013Z ---------------------------------------------------------------------- 2022-08-17T13:09:38.1161798Z test_nccl_timeout (__main__.NcclErrorHandlingTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:09:38.1356018Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40067 2022-08-17T13:09:38.1362040Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40068 2022-08-17T13:09:38.1368996Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 40069 2022-08-17T13:09:39.5356131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:39.5356631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:39.5357615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:39.5358154Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:39.5372287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:39.5372781Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:39.5375828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:39.5376309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:39.5621232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:39.5621713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:39.5624530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:39.5625015Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:39.7057062Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:09:39.7072993Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:09:39.7351006Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:09:40.0425667Z skip: Need at least 3 CUDA devices (3.429s) 2022-08-17T13:09:40.0425892Z 2022-08-17T13:09:40.0426249Z ---------------------------------------------------------------------- 2022-08-17T13:09:40.0426593Z Ran 1 test in 3.430s 2022-08-17T13:09:40.0426759Z 2022-08-17T13:09:40.0426871Z OK (skipped=1) 2022-08-17T13:09:40.0427028Z 2022-08-17T13:09:40.0427156Z Generating XML reports... 2022-08-17T13:09:40.0462021Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130936.xml 2022-08-17T13:09:41.8280288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:41.8281176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:41.8281787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:41.8282269Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:42.0040067Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:42.0055631Z 2022-08-17T13:09:42.0056018Z Running tests... 2022-08-17T13:09:42.0056521Z ---------------------------------------------------------------------- 2022-08-17T13:09:42.0062752Z test_init_no_gpus (__main__.ProcessGroupNCCLNoGPUTest) ... 
skip: GPUs are available, skipping test (0.001s) 2022-08-17T13:09:42.0063083Z 2022-08-17T13:09:42.0063381Z ---------------------------------------------------------------------- 2022-08-17T13:09:42.0064161Z Ran 1 test in 0.001s 2022-08-17T13:09:42.0064469Z 2022-08-17T13:09:42.0064654Z OK (skipped=1) 2022-08-17T13:09:42.0064924Z 2022-08-17T13:09:42.0065117Z Generating XML reports... 2022-08-17T13:09:42.0097000Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLNoGPUTest-20220817130941.xml 2022-08-17T13:09:43.6627786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:43.6628293Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:43.6629467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:43.6629943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:43.8396868Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:43.8412547Z 2022-08-17T13:09:43.8412763Z Running tests... 2022-08-17T13:09:43.8413220Z ---------------------------------------------------------------------- 2022-08-17T13:09:45.3605376Z test_allgather_base_basics (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:09:45.3798111Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40237 2022-08-17T13:09:45.3804353Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40238 2022-08-17T13:09:46.8455522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:46.8456045Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:46.8456802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:46.8457289Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:46.8561216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:46.8561706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:46.8564582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:46.8565096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:47.0180869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:09:47.0184894Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:09:47.0274513Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:09:47.0278171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:09:47.0279079Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:09:47.0287775Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:09:48.6888841Z ok (4.847s) 2022-08-17T13:09:48.6889061Z 2022-08-17T13:09:48.6889468Z ---------------------------------------------------------------------- 2022-08-17T13:09:48.6889793Z Ran 1 test in 4.847s 2022-08-17T13:09:48.6889959Z 2022-08-17T13:09:48.6890051Z OK 2022-08-17T13:09:48.6890189Z 2022-08-17T13:09:48.6890326Z Generating XML reports... 2022-08-17T13:09:48.6924249Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817130943.xml 2022-08-17T13:09:50.4845339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:50.4845822Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:50.4846802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:50.4847302Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:50.6614076Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:50.6630367Z 2022-08-17T13:09:50.6630826Z Running tests... 2022-08-17T13:09:50.6631324Z ---------------------------------------------------------------------- 2022-08-17T13:09:52.1770688Z test_allgather_base_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:09:52.1968609Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40346 2022-08-17T13:09:52.1974318Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40347 2022-08-17T13:09:53.5857559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:53.5858075Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:53.5859157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:53.5859641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:53.6137331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:53.6137792Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:53.6140405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:53.6140883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:53.7516398Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:09:53.7519212Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:09:53.7858750Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:09:53.7862888Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:09:53.7864108Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:09:53.7927344Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:09:56.6085259Z ok (5.945s) 2022-08-17T13:09:56.6085508Z 2022-08-17T13:09:56.6085888Z ---------------------------------------------------------------------- 2022-08-17T13:09:56.6086230Z Ran 1 test in 5.945s 2022-08-17T13:09:56.6086398Z 2022-08-17T13:09:56.6086491Z OK 2022-08-17T13:09:56.6086610Z 2022-08-17T13:09:56.6086745Z Generating XML reports... 2022-08-17T13:09:56.6121339Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817130950.xml 2022-08-17T13:09:58.3763479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:09:58.3764012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:09:58.3764801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:09:58.3765284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:09:58.5516841Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:09:58.5532538Z 2022-08-17T13:09:58.5532791Z Running tests... 2022-08-17T13:09:58.5533228Z ---------------------------------------------------------------------- 2022-08-17T13:10:00.0607022Z test_allgather_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:10:00.0802412Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40463 2022-08-17T13:10:00.0808406Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40464 2022-08-17T13:10:01.5251372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:01.5251889Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:01.5252645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:01.5253123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:01.5612900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:01.5613365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:01.5615876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:01.5616373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:01.6935039Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:10:01.6937377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:10:01.7363976Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:10:01.7367269Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:10:01.7368063Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:10:01.7447694Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:10:04.5922794Z ok (6.039s) 2022-08-17T13:10:04.5923015Z 2022-08-17T13:10:04.5923409Z ---------------------------------------------------------------------- 2022-08-17T13:10:04.5923755Z Ran 1 test in 6.039s 2022-08-17T13:10:04.5923920Z 2022-08-17T13:10:04.5924014Z OK 2022-08-17T13:10:04.5924152Z 2022-08-17T13:10:04.5924288Z Generating XML reports... 2022-08-17T13:10:04.5958817Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817130958.xml 2022-08-17T13:10:06.3774930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:06.3775455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:06.3776216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:06.3776697Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:06.5533751Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:10:06.5549818Z 2022-08-17T13:10:06.5550204Z Running tests... 2022-08-17T13:10:06.5550643Z ---------------------------------------------------------------------- 2022-08-17T13:10:08.0607722Z test_allreduce_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:10:08.0802384Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40580 2022-08-17T13:10:08.0808375Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40581 2022-08-17T13:10:09.4931739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:09.4932412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:09.4933381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:09.4933877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:09.5078739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:09.5079219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:09.5082365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:09.5082843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:09.6603710Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:10:09.6607062Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:10:09.6795869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:10:09.6799587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:10:09.6800642Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:10:09.6812076Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:10:12.5922476Z ok (6.037s) 2022-08-17T13:10:12.5922911Z 2022-08-17T13:10:12.5923528Z ---------------------------------------------------------------------- 2022-08-17T13:10:12.5924119Z Ran 1 test in 6.037s 2022-08-17T13:10:12.5924409Z 2022-08-17T13:10:12.5924569Z OK 2022-08-17T13:10:12.5924814Z 2022-08-17T13:10:12.5925035Z Generating XML reports... 2022-08-17T13:10:12.5960793Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131006.xml 2022-08-17T13:10:14.3698064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:14.3699063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:14.3700247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:14.3701205Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:14.5468611Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:10:14.5485791Z 2022-08-17T13:10:14.5486221Z Running tests... 2022-08-17T13:10:14.5486737Z ---------------------------------------------------------------------- 2022-08-17T13:10:16.0486766Z test_barrier (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:10:16.0674728Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40697 2022-08-17T13:10:16.0681261Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40698 2022-08-17T13:10:17.4773650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:17.4774634Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:17.4776212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:17.4777155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:17.5029782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:17.5030687Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:17.5033854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:17.5034858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:17.6425649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:10:17.6428994Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:10:17.6775285Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:10:17.6780099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:10:17.6781466Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:10:17.6838008Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:10:20.4794327Z ok (5.931s) 2022-08-17T13:10:20.4794545Z 2022-08-17T13:10:20.4795140Z ---------------------------------------------------------------------- 2022-08-17T13:10:20.4795481Z Ran 1 test in 5.931s 2022-08-17T13:10:20.4795646Z 2022-08-17T13:10:20.4795729Z OK 2022-08-17T13:10:20.4795867Z 2022-08-17T13:10:20.4796013Z Generating XML reports... 2022-08-17T13:10:20.4830791Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131014.xml 2022-08-17T13:10:22.2320254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:22.2320771Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:22.2321774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:22.2322290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:22.4011649Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:10:22.4027406Z 2022-08-17T13:10:22.4027557Z Running tests... 2022-08-17T13:10:22.4028001Z ---------------------------------------------------------------------- 2022-08-17T13:10:23.8729069Z test_broadcast_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:10:23.8916395Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40814 2022-08-17T13:10:23.8922396Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40815 2022-08-17T13:10:25.2480448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:25.2480992Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:25.2481573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:25.2482048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:25.3074745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:25.3075223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:25.3077905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:25.3078570Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:25.4165264Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:10:25.4168448Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:10:25.4782489Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:10:25.4786711Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:10:25.4787445Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:10:25.4881758Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:10:28.3043393Z ok (5.901s) 2022-08-17T13:10:28.3043615Z 2022-08-17T13:10:28.3044023Z ---------------------------------------------------------------------- 2022-08-17T13:10:28.3044384Z Ran 1 test in 5.901s 2022-08-17T13:10:28.3044532Z 2022-08-17T13:10:28.3044627Z OK 2022-08-17T13:10:28.3044764Z 2022-08-17T13:10:28.3044900Z Generating XML reports... 2022-08-17T13:10:28.3081745Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131022.xml 2022-08-17T13:10:30.0925140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:30.0926139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:30.0927340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:30.0928267Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:30.2744860Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:10:30.2762175Z 2022-08-17T13:10:30.2762659Z Running tests... 2022-08-17T13:10:30.2763151Z ---------------------------------------------------------------------- 2022-08-17T13:10:31.7869729Z test_empty_tensors (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:10:31.8067366Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40931 2022-08-17T13:10:31.8073307Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40932 2022-08-17T13:10:33.2121373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:33.2122374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:33.2123575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:33.2124475Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:33.2290261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:33.2291189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:33.2294782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:33.2295781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:33.3802633Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:10:33.3804298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:10:33.4017983Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:10:33.4022303Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:10:33.4023992Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:10:33.4112395Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:10:36.2183219Z ok (5.942s) 2022-08-17T13:10:36.2183424Z 2022-08-17T13:10:36.2183792Z ---------------------------------------------------------------------- 2022-08-17T13:10:36.2184390Z Ran 1 test in 5.942s 2022-08-17T13:10:36.2184561Z 2022-08-17T13:10:36.2184657Z OK 2022-08-17T13:10:36.2184796Z 2022-08-17T13:10:36.2184934Z Generating XML reports... 2022-08-17T13:10:36.2221216Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131030.xml 2022-08-17T13:10:37.9873038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:37.9873991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:37.9875190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:37.9876122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:38.1636022Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:10:38.1653097Z 2022-08-17T13:10:38.1653545Z Running tests... 2022-08-17T13:10:38.1654045Z ---------------------------------------------------------------------- 2022-08-17T13:10:39.6519522Z test_gather_checks (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:10:39.6704593Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41048 2022-08-17T13:10:39.6711121Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41049 2022-08-17T13:10:41.0637308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:41.0637816Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:41.0638614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:41.0639109Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:41.0925328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:41.0925814Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:41.0928844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:41.0929322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:41.2323317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:10:41.2326288Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:10:41.2657244Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:10:41.2661601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:10:41.2662352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:10:41.2734686Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:10:42.9797989Z ok (4.814s) 2022-08-17T13:10:42.9798223Z 2022-08-17T13:10:42.9798611Z ---------------------------------------------------------------------- 2022-08-17T13:10:42.9798966Z Ran 1 test in 4.814s 2022-08-17T13:10:42.9799135Z 2022-08-17T13:10:42.9799230Z OK 2022-08-17T13:10:42.9799371Z 2022-08-17T13:10:42.9799488Z Generating XML reports... 2022-08-17T13:10:42.9834937Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131038.xml 2022-08-17T13:10:44.7574413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:44.7574943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:44.7575999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:44.7576481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:44.9324298Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:10:44.9340569Z 2022-08-17T13:10:44.9340706Z Running tests... 2022-08-17T13:10:44.9341527Z ---------------------------------------------------------------------- 2022-08-17T13:10:46.4413691Z test_gather_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:10:46.4610258Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41157 2022-08-17T13:10:46.4616737Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41158 2022-08-17T13:10:47.8658115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:47.8658599Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:47.8659587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:47.8660100Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:47.8866995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:47.8867461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:47.8870724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:47.8871214Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:48.0315722Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:10:48.0318868Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:10:48.0589214Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:10:48.0593131Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:10:48.0593860Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:10:48.0625433Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:10:50.8729471Z ok (5.939s) 2022-08-17T13:10:50.8729724Z 2022-08-17T13:10:50.8730173Z ---------------------------------------------------------------------- 2022-08-17T13:10:50.8730519Z Ran 1 test in 5.939s 2022-08-17T13:10:50.8730685Z 2022-08-17T13:10:50.8730782Z OK 2022-08-17T13:10:50.8730920Z 2022-08-17T13:10:50.8734696Z Generating XML reports... 2022-08-17T13:10:50.8765812Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131044.xml 2022-08-17T13:10:52.6463716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:52.6464544Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:52.6465698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:52.6466204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:52.8221512Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:10:52.8238642Z 2022-08-17T13:10:52.8238956Z Running tests... 2022-08-17T13:10:52.8239543Z ---------------------------------------------------------------------- 2022-08-17T13:10:54.3371118Z test_gather_stress (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:10:54.3558571Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41278 2022-08-17T13:10:54.3565280Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41279 2022-08-17T13:10:55.7418198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:55.7418706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:55.7431673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:55.7432171Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:55.7808224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:10:55.7808702Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:10:55.7811907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:10:55.7812385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:10:55.9088128Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:10:55.9091492Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:10:55.9543018Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:10:55.9547182Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:10:55.9548060Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:10:55.9601673Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:11:02.6752343Z ok (9.851s) 2022-08-17T13:11:02.6752665Z 2022-08-17T13:11:02.6753425Z ---------------------------------------------------------------------- 2022-08-17T13:11:02.6753946Z Ran 1 test in 9.851s 2022-08-17T13:11:02.6754119Z 2022-08-17T13:11:02.6754217Z OK 2022-08-17T13:11:02.6754355Z 2022-08-17T13:11:02.6754837Z Generating XML reports... 2022-08-17T13:11:02.6788051Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131052.xml 2022-08-17T13:11:04.4503701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:04.4504445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:04.4505433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:04.4505951Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:04.6269348Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:11:04.6284470Z 2022-08-17T13:11:04.6284915Z Running tests... 2022-08-17T13:11:04.6285420Z ---------------------------------------------------------------------- 2022-08-17T13:11:06.1344994Z test_reduce_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:11:06.1541698Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41399 2022-08-17T13:11:06.1548295Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41400 2022-08-17T13:11:07.5447910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:07.5448743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:07.5449972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:07.5450596Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:07.5798034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:07.5798832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:07.5800439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:07.5801236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:07.7117074Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:11:07.7120252Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:11:07.7534059Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:11:07.7537816Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:11:07.7539255Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:11:07.7631016Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:11:10.5660863Z ok (5.937s) 2022-08-17T13:11:10.5661248Z 2022-08-17T13:11:10.5661986Z ---------------------------------------------------------------------- 2022-08-17T13:11:10.5662543Z Ran 1 test in 5.938s 2022-08-17T13:11:10.5662713Z 2022-08-17T13:11:10.5662808Z OK 2022-08-17T13:11:10.5662944Z 2022-08-17T13:11:10.5663060Z Generating XML reports... 2022-08-17T13:11:10.5697850Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131104.xml 2022-08-17T13:11:12.3555304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:12.3556012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:12.3557183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:12.3557699Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:12.5306551Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:11:12.5322470Z 2022-08-17T13:11:12.5322738Z Running tests... 2022-08-17T13:11:12.5323356Z ---------------------------------------------------------------------- 2022-08-17T13:11:14.0289274Z test_reduce_scatter_base_basics (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:11:14.0483814Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41516 2022-08-17T13:11:14.0490113Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41517 2022-08-17T13:11:15.4634466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:15.4635056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:15.4636305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:15.4636824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:15.4782000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:15.4782764Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:15.4784972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:15.4785877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:15.6292571Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:11:15.6295766Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:11:15.6493894Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:11:15.6497694Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:11:15.6498968Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:11:15.6500492Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:11:17.3576456Z ok (4.825s) 2022-08-17T13:11:17.3576911Z 2022-08-17T13:11:17.3577617Z ---------------------------------------------------------------------- 2022-08-17T13:11:17.3578010Z Ran 1 test in 4.825s 2022-08-17T13:11:17.3578178Z 2022-08-17T13:11:17.3578253Z OK 2022-08-17T13:11:17.3578388Z 2022-08-17T13:11:17.3578541Z Generating XML reports... 2022-08-17T13:11:17.3613741Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131112.xml 2022-08-17T13:11:19.1275884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:19.1276600Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:19.1277583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:19.1278331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:19.3028315Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:11:19.3044236Z 2022-08-17T13:11:19.3044697Z Running tests... 2022-08-17T13:11:19.3045187Z ---------------------------------------------------------------------- 2022-08-17T13:11:20.8004930Z test_reduce_scatter_base_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:11:20.8200875Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41625 2022-08-17T13:11:20.8206939Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41626 2022-08-17T13:11:22.1707277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:22.1707796Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:22.1709076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:22.1709570Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:22.1987536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:22.1988314Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:22.1990742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:22.1991228Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:22.3374505Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:11:22.3377191Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:11:22.3741149Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:11:22.3745518Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:11:22.3746806Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:11:22.3784961Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:11:25.2335084Z ok (5.929s) 2022-08-17T13:11:25.2335304Z 2022-08-17T13:11:25.2335720Z ---------------------------------------------------------------------- 2022-08-17T13:11:25.2336066Z Ran 1 test in 5.929s 2022-08-17T13:11:25.2336239Z 2022-08-17T13:11:25.2336314Z OK 2022-08-17T13:11:25.2336449Z 2022-08-17T13:11:25.2336589Z Generating XML reports... 2022-08-17T13:11:25.2373046Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131119.xml 2022-08-17T13:11:26.9944332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:26.9945023Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:26.9945653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:26.9946143Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:27.1710138Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:11:27.1725939Z 2022-08-17T13:11:27.1726186Z Running tests... 2022-08-17T13:11:27.1726635Z ---------------------------------------------------------------------- 2022-08-17T13:11:28.6937197Z test_reduce_scatter_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:11:28.7133287Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41742 2022-08-17T13:11:28.7139093Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41743 2022-08-17T13:11:30.1563452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:30.1563957Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:30.1564571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:30.1565052Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:30.1879679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:30.1880131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:30.1883070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:30.1883557Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:30.3221658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:11:30.3225202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:11:30.3588146Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:11:30.3592416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:11:30.3593156Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:11:30.3633858Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:11:33.1251666Z ok (5.952s) 2022-08-17T13:11:33.1251888Z 2022-08-17T13:11:33.1252531Z ---------------------------------------------------------------------- 2022-08-17T13:11:33.1252882Z Ran 1 test in 5.952s 2022-08-17T13:11:33.1253052Z 2022-08-17T13:11:33.1253147Z OK 2022-08-17T13:11:33.1253264Z 2022-08-17T13:11:33.1253399Z Generating XML reports... 2022-08-17T13:11:33.1289157Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131127.xml 2022-08-17T13:11:34.8757113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:34.8757618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:34.8758382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:34.8758845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:35.0444157Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:11:35.0459198Z 2022-08-17T13:11:35.0459522Z Running tests... 2022-08-17T13:11:35.0459993Z ---------------------------------------------------------------------- 2022-08-17T13:11:36.5066090Z test_scatter_checks (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:11:36.5252236Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41859 2022-08-17T13:11:36.5258036Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41860 2022-08-17T13:11:37.9960060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:37.9960562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:37.9961584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:37.9962064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:38.0207537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:38.0208009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:38.0211137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:38.0211634Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:38.1685270Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:11:38.1688092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:11:38.1919938Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:11:38.1923565Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:11:38.1924416Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:11:38.1994779Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:11:39.9345086Z ok (4.888s) 2022-08-17T13:11:39.9345278Z 2022-08-17T13:11:39.9345682Z ---------------------------------------------------------------------- 2022-08-17T13:11:39.9346051Z Ran 1 test in 4.888s 2022-08-17T13:11:39.9346224Z 2022-08-17T13:11:39.9346607Z OK 2022-08-17T13:11:39.9346767Z 2022-08-17T13:11:39.9346886Z Generating XML reports... 2022-08-17T13:11:39.9380819Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131135.xml 2022-08-17T13:11:41.7157713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:41.7158338Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:41.7159411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:41.7159906Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:41.8907108Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:11:41.8922892Z 2022-08-17T13:11:41.8923295Z Running tests... 2022-08-17T13:11:41.8923788Z ---------------------------------------------------------------------- 2022-08-17T13:11:43.4032582Z test_scatter_ops (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:11:43.4227911Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41968 2022-08-17T13:11:43.4234356Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41969 2022-08-17T13:11:44.8972130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:44.8972651Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:44.8973605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:44.8974110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:44.9124506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:44.9124991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:44.9127578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:44.9128060Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:45.0631603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:11:45.0635101Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:11:45.0844895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:11:45.0848406Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:11:45.0849497Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:11:45.0941924Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:11:47.9349486Z ok (6.042s) 2022-08-17T13:11:47.9349816Z 2022-08-17T13:11:47.9350394Z ---------------------------------------------------------------------- 2022-08-17T13:11:47.9350772Z Ran 1 test in 6.043s 2022-08-17T13:11:47.9350921Z 2022-08-17T13:11:47.9351020Z OK 2022-08-17T13:11:47.9351156Z 2022-08-17T13:11:47.9351297Z Generating XML reports... 2022-08-17T13:11:47.9385001Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131141.xml 2022-08-17T13:11:49.7140936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:49.7141465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:49.7142278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:49.7143063Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:49.8892923Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:11:49.8908988Z 2022-08-17T13:11:49.8909236Z Running tests... 2022-08-17T13:11:49.8909658Z ---------------------------------------------------------------------- 2022-08-17T13:11:51.4005905Z test_scatter_stress (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:11:51.4200961Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42089 2022-08-17T13:11:51.4206977Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42090 2022-08-17T13:11:52.8239754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:52.8240616Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:52.8241458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:52.8241959Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:52.8511382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:11:52.8511860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:11:52.8515176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:11:52.8515648Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:11:52.9900196Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:11:52.9903222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:11:53.0248615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:11:53.0253859Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:11:53.0255204Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:11:53.0311916Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:11:59.7396250Z ok (9.848s) 2022-08-17T13:11:59.7396459Z 2022-08-17T13:11:59.7396867Z ---------------------------------------------------------------------- 2022-08-17T13:11:59.7397219Z Ran 1 test in 9.849s 2022-08-17T13:11:59.7397389Z 2022-08-17T13:11:59.7397486Z OK 2022-08-17T13:11:59.7399613Z 2022-08-17T13:11:59.7400162Z Generating XML reports... 2022-08-17T13:11:59.7431936Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131149.xml 2022-08-17T13:12:01.4880093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:01.4880614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:01.4881980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:01.4882465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:01.6589936Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:12:01.6603969Z 2022-08-17T13:12:01.6604210Z Running tests... 2022-08-17T13:12:01.6604649Z ---------------------------------------------------------------------- 2022-08-17T13:12:03.1484248Z test_send_recv (__main__.ProcessGroupNCCLTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:12:03.1672529Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42210 2022-08-17T13:12:03.1678567Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42211 2022-08-17T13:12:04.6496701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:04.6497257Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:04.6498094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:04.6498585Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:04.6553680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:04.6554151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:04.6556927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:04.6557413Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:04.8185852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:12:04.8188758Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:12:04.8245404Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:12:04.8249001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:12:04.8250180Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:12:04.8291777Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:12:06.6770325Z ok (5.016s) 2022-08-17T13:12:06.6770565Z 2022-08-17T13:12:06.6770969Z ---------------------------------------------------------------------- 2022-08-17T13:12:06.6771313Z Ran 1 test in 5.017s 2022-08-17T13:12:06.6771489Z 2022-08-17T13:12:06.6771608Z OK 2022-08-17T13:12:06.6771728Z 2022-08-17T13:12:06.6771869Z Generating XML reports... 2022-08-17T13:12:06.6806674Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131201.xml 2022-08-17T13:12:08.4205460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:08.4205952Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:08.4207650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:08.4208138Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:08.5888602Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:12:08.5903959Z 2022-08-17T13:12:08.5904359Z Running tests... 2022-08-17T13:12:08.5904878Z ---------------------------------------------------------------------- 2022-08-17T13:12:10.0552662Z test_common_errors (__main__.RendezvousEnvTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:12:10.0734673Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:12:10.0735634Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 
2022-08-17T13:12:10.0758592Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:12:10.0759434Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:12:10.0777583Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:12:10.0778850Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:12:10.0796515Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:12:10.0797515Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:12:10.0863482Z ok (1.496s) 2022-08-17T13:12:10.0865339Z 2022-08-17T13:12:10.0865667Z ---------------------------------------------------------------------- 2022-08-17T13:12:10.0866027Z Ran 1 test in 1.496s 2022-08-17T13:12:10.0866197Z 2022-08-17T13:12:10.0866294Z OK 2022-08-17T13:12:10.0866410Z 2022-08-17T13:12:10.0866541Z Generating XML reports... 
2022-08-17T13:12:10.0897717Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-RendezvousEnvTest-20220817131208.xml 2022-08-17T13:12:11.8318134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:11.8318660Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:11.8319462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:11.8319954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:12.0067530Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_nccl 2022-08-17T13:12:12.0083046Z 2022-08-17T13:12:12.0083353Z Running tests... 2022-08-17T13:12:12.0083844Z ---------------------------------------------------------------------- 2022-08-17T13:12:13.5190872Z test_default_store_timeout_nccl (__main__.TimeoutTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:12:13.5368194Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:12:13.5368992Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:12:15.5511917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:12:15.5512744Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:12:16.5745221Z ok (4.566s) 2022-08-17T13:12:16.5745493Z 2022-08-17T13:12:16.5745858Z ---------------------------------------------------------------------- 2022-08-17T13:12:16.5746200Z Ran 1 test in 4.566s 2022-08-17T13:12:16.5746377Z 2022-08-17T13:12:16.5746454Z OK 2022-08-17T13:12:16.5746592Z 2022-08-17T13:12:16.5746724Z Generating XML reports... 
2022-08-17T13:12:16.5782651Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_nccl/TEST-TimeoutTest-20220817131212.xml 2022-08-17T13:12:17.1477785Z Running distributed/test_c10d_gloo ... [2022-08-17 13:12:17.147314] 2022-08-17T13:12:17.1478725Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_gloo.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:12:17.147397] 2022-08-17T13:12:18.7710075Z , <__main__.CommTest testMethod=test_broadcast_coalesced_gloo_cuda>, <__main__.CommTest testMethod=test_gloo_barrier_device_ids>, <__main__.CommTest testMethod=test_gloo_warn_not_in_group>, <__main__.CommTest testMethod=test_sequence_num_incremented_gloo_default>, <__main__.CommTest testMethod=test_sequence_num_incremented_gloo_subgroup>, <__main__.CommTest testMethod=test_sequence_num_set_default_pg_gloo>, <__main__.CommTest testMethod=test_sequence_num_set_gloo_new_group>]> 2022-08-17T13:12:18.7711261Z test_broadcast_coalesced_gloo_cpu (__main__.CommTest) 2022-08-17T13:12:18.7711912Z test_broadcast_coalesced_gloo_cuda (__main__.CommTest) 2022-08-17T13:12:18.7712261Z test_gloo_barrier_device_ids (__main__.CommTest) 2022-08-17T13:12:18.7712604Z test_gloo_warn_not_in_group (__main__.CommTest) 2022-08-17T13:12:18.7712963Z test_sequence_num_incremented_gloo_default (__main__.CommTest) 2022-08-17T13:12:18.7713611Z test_sequence_num_incremented_gloo_subgroup (__main__.CommTest) 2022-08-17T13:12:18.7713991Z test_sequence_num_set_default_pg_gloo (__main__.CommTest) 2022-08-17T13:12:18.7714354Z test_sequence_num_set_gloo_new_group (__main__.CommTest) 2022-08-17T13:12:18.7719724Z , <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_dynamic_weight_sharing>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_once_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_once_use_reentrant_True>, 
<__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_static_graph_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_static_graph_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_weight_sharing>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_unused_params_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_unused_params_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_weight_sharing_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_weight_sharing_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_future_passing_cpu>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_future_passing_gpu_gloo>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_register_just_once>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_sparse_gradients>, <__main__.DistributedDataParallelTest testMethod=test_ddp_invalid_comm_hook_init>, <__main__.DistributedDataParallelTest testMethod=test_ddp_invalid_comm_hook_return_type>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_when_unused_parameters_empty>, <__main__.DistributedDataParallelTest testMethod=test_global_local_unused_params_grad>, <__main__.DistributedDataParallelTest testMethod=test_global_local_unused_params_grad_with_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_global_local_unused_params_grad_with_static_graph>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_1gpu_module_device_ids_integer_list>, 
<__main__.DistributedDataParallelTest testMethod=test_gloo_backend_1gpu_module_device_ids_torch_device_list>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_2gpu_module>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_4gpu_module>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_cpu_module>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_cpu_module_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_ignored_output>, <__main__.DistributedDataParallelTest testMethod=test_ignored_output_with_unused_parameters>, <__main__.DistributedDataParallelTest testMethod=test_invalid_powerSGD_state>, <__main__.DistributedDataParallelTest testMethod=test_save_load_checkpoint>, <__main__.DistributedDataParallelTest testMethod=test_sparse_gradients>, <__main__.DistributedDataParallelTest testMethod=test_sparse_gradients_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_sync_batch_norm_empty_input>, <__main__.DistributedDataParallelTest testMethod=test_sync_batch_norm_only_empty_input>]> 2022-08-17T13:12:18.7725206Z test_ddp_checkpointing_dynamic_module (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7725694Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7726231Z test_ddp_checkpointing_once_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7726704Z test_ddp_checkpointing_once_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7727261Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7727783Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7728289Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7728771Z 
test_ddp_checkpointing_twice_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7729235Z test_ddp_checkpointing_twice_weight_sharing (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7729789Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7730296Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7730792Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7731297Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7731774Z test_ddp_comm_hook_future_passing_cpu (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7732228Z test_ddp_comm_hook_future_passing_gpu_gloo (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7732661Z test_ddp_comm_hook_register_just_once (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7733104Z test_ddp_comm_hook_sparse_gradients (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7733542Z test_ddp_invalid_comm_hook_init (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7733966Z test_ddp_invalid_comm_hook_return_type (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7734441Z test_find_unused_parameters_when_unused_parameters_empty (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7734912Z test_global_local_unused_params_grad (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7735383Z test_global_local_unused_params_grad_with_grad_is_view (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7735857Z test_global_local_unused_params_grad_with_static_graph (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7736343Z test_gloo_backend_1gpu_module_device_ids_integer_list (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7736837Z 
test_gloo_backend_1gpu_module_device_ids_torch_device_list (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7737279Z test_gloo_backend_2gpu_module (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7737703Z test_gloo_backend_4gpu_module (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7738127Z test_gloo_backend_cpu_module (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7738565Z test_gloo_backend_cpu_module_grad_is_view (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7738975Z test_ignored_output (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7739407Z test_ignored_output_with_unused_parameters (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7739847Z test_invalid_powerSGD_state (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7740252Z test_save_load_checkpoint (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7740655Z test_sparse_gradients (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7741078Z test_sparse_gradients_grad_is_view (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7741495Z test_sync_batch_norm_empty_input (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7741931Z test_sync_batch_norm_only_empty_input (__main__.DistributedDataParallelTest) 2022-08-17T13:12:18.7742306Z 2022-08-17T13:12:18.7747905Z , <__main__.ProcessGroupGlooTest testMethod=test_allgather_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_coalesced_async>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_coalesced_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_noncontiguous_input>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_stress>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics_cuda>, 
<__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics_cuda_using_work_api>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics_using_work_api>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_async>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_basics>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_checks_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_stress>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_stress>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_barrier_implies_wait>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_basics>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_checks>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_stress>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_empty_tensors>, <__main__.ProcessGroupGlooTest testMethod=test_gather_basics>, <__main__.ProcessGroupGlooTest testMethod=test_gather_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_gather_checks>, <__main__.ProcessGroupGlooTest testMethod=test_gather_noncontiguous_input>, <__main__.ProcessGroupGlooTest testMethod=test_gather_stress>, <__main__.ProcessGroupGlooTest testMethod=test_gather_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_multi_device_constructor>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_basics>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_checks>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_stress>, <__main__.ProcessGroupGlooTest 
testMethod=test_reduce_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_round_robin>, <__main__.ProcessGroupGlooTest testMethod=test_round_robin_create_destroy>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_basics>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_checks>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_stress>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_send_recv_all_to_all>, <__main__.ProcessGroupGlooTest testMethod=test_sparse_allreduce_basics>, <__main__.ProcessGroupGlooTest testMethod=test_sparse_allreduce_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_sparse_allreduce_checks>]> 2022-08-17T13:12:18.7753047Z test_allgather_basics (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7753433Z test_allgather_basics_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7753808Z test_allgather_checks (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7754176Z test_allgather_coalesced_async (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7754570Z test_allgather_coalesced_checks (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7754978Z test_allgather_noncontiguous_input (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7755357Z test_allgather_stress (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7755736Z test_allgather_stress_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7756159Z test_allreduce_basics (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7756524Z test_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7756923Z test_allreduce_basics_cuda_using_work_api (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7757343Z test_allreduce_basics_using_work_api (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7757734Z test_allreduce_checks (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7758097Z 
test_allreduce_coalesced_async (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7758488Z test_allreduce_coalesced_basics (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7758881Z test_allreduce_coalesced_checks (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7759269Z test_allreduce_coalesced_checks_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7759735Z test_allreduce_coalesced_stress (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7760110Z test_allreduce_stress (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7760472Z test_allreduce_stress_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7760849Z test_barrier_implies_wait (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7761216Z test_broadcast_basics (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7761586Z test_broadcast_basics_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7761938Z test_broadcast_checks (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7762301Z test_broadcast_stress (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7762667Z test_broadcast_stress_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7763014Z test_empty_tensors (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7763367Z test_gather_basics (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7763728Z test_gather_basics_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7764078Z test_gather_checks (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7764455Z test_gather_noncontiguous_input (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7764830Z test_gather_stress (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7765176Z test_gather_stress_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7765559Z test_multi_device_constructor (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7765927Z test_reduce_basics (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7766287Z test_reduce_basics_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7766629Z 
test_reduce_checks (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7766975Z test_reduce_stress (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7767327Z test_reduce_stress_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7767670Z test_round_robin (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7768039Z test_round_robin_create_destroy (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7768413Z test_scatter_basics (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7768760Z test_scatter_basics_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7769121Z test_scatter_checks (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7769476Z test_scatter_stress (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7769822Z test_scatter_stress_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7770190Z test_send_recv_all_to_all (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7770567Z test_sparse_allreduce_basics (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7770960Z test_sparse_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7771340Z test_sparse_allreduce_checks (__main__.ProcessGroupGlooTest) 2022-08-17T13:12:18.7772195Z , <__main__.ReducerTest testMethod=test_forward_backward_optimizer>, <__main__.ReducerTest testMethod=test_forward_backward_unused_parameters>, <__main__.ReducerTest testMethod=test_multi_dtype_multi_bucket>, <__main__.ReducerTest testMethod=test_multi_dtype_single_bucket>, <__main__.ReducerTest testMethod=test_single_dtype_single_bucket>]> 2022-08-17T13:12:18.7773043Z test_forward_backward (__main__.ReducerTest) 2022-08-17T13:12:18.7773393Z test_forward_backward_optimizer (__main__.ReducerTest) 2022-08-17T13:12:18.7773745Z test_forward_backward_unused_parameters (__main__.ReducerTest) 2022-08-17T13:12:18.7774104Z test_multi_dtype_multi_bucket (__main__.ReducerTest) 2022-08-17T13:12:18.7774447Z test_multi_dtype_single_bucket (__main__.ReducerTest) 2022-08-17T13:12:18.7774794Z 
test_single_dtype_single_bucket (__main__.ReducerTest) 2022-08-17T13:12:18.7775197Z ]> 2022-08-17T13:12:18.7775602Z test_logging_init (__main__.RendezvousEnvTest) 2022-08-17T13:12:18.7775925Z 2022-08-17T13:12:18.7776325Z ]> 2022-08-17T13:12:18.7776801Z test_default_store_timeout_gloo (__main__.TimeoutTest) 2022-08-17T13:12:20.1961675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:20.1962186Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:20.1964520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:20.1965012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:20.3716156Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:12:20.3731053Z 2022-08-17T13:12:20.3731284Z Running tests... 2022-08-17T13:12:20.3731710Z ---------------------------------------------------------------------- 2022-08-17T13:12:21.8749329Z test_broadcast_coalesced_gloo_cpu (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:12:21.8944744Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42453 2022-08-17T13:12:21.8951410Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42454 2022-08-17T13:12:23.3409202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:23.3410148Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:23.3418770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:23.3419743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:23.3647668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:23.3648591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:23.3660088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:23.3661101Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:23.5058058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:12:23.5300303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:12:23.9006096Z ok (3.527s) 2022-08-17T13:12:23.9006305Z 2022-08-17T13:12:23.9006696Z ---------------------------------------------------------------------- 2022-08-17T13:12:23.9007059Z Ran 1 test in 3.527s 2022-08-17T13:12:23.9007208Z 2022-08-17T13:12:23.9007304Z OK 2022-08-17T13:12:23.9007438Z 2022-08-17T13:12:23.9007573Z Generating XML reports... 
2022-08-17T13:12:23.9043351Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131220.xml 2022-08-17T13:12:25.6534271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:25.6534786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:25.6537390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:25.6537899Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:25.8288978Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:12:25.8303873Z 2022-08-17T13:12:25.8304087Z Running tests... 2022-08-17T13:12:25.8304517Z ---------------------------------------------------------------------- 2022-08-17T13:12:27.3514491Z test_broadcast_coalesced_gloo_cuda (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:12:27.3708948Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42562 2022-08-17T13:12:27.3715130Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42563 2022-08-17T13:12:28.7657778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:28.7658308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:28.7667025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:28.7667513Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:28.7673205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:28.7673667Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:28.7684185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:28.7684671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:28.9371112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:12:28.9374934Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:12:30.6800919Z ok (4.849s) 2022-08-17T13:12:30.6801141Z 2022-08-17T13:12:30.6801523Z ---------------------------------------------------------------------- 2022-08-17T13:12:30.6801845Z Ran 1 test in 4.850s 2022-08-17T13:12:30.6802016Z 2022-08-17T13:12:30.6802110Z OK 2022-08-17T13:12:30.6802243Z 2022-08-17T13:12:30.6802382Z Generating XML reports... 
2022-08-17T13:12:30.6836350Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131225.xml 2022-08-17T13:12:32.4645710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:32.4646215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:32.4648402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:32.4648918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:32.6413822Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:12:32.6429882Z 2022-08-17T13:12:32.6430076Z Running tests... 2022-08-17T13:12:32.6430506Z ---------------------------------------------------------------------- 2022-08-17T13:12:34.1526855Z test_gloo_barrier_device_ids (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:12:34.1721737Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42673 2022-08-17T13:12:34.1727895Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42674 2022-08-17T13:12:35.5810854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:35.5811357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:35.5820412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:35.5821190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:35.6086436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:35.6086898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-08-17T13:12:35.6098204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:35.6098665Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:35.7462476Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:12:35.7801008Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:12:35.8014247Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:12:35.8014744Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:12:35.8015473Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:12:35.8016168Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:12:36.1785102Z ok (3.535s) 2022-08-17T13:12:36.1785298Z 2022-08-17T13:12:36.1785671Z ---------------------------------------------------------------------- 2022-08-17T13:12:36.1786014Z Ran 1 test in 3.535s 2022-08-17T13:12:36.1786180Z 2022-08-17T13:12:36.1786263Z OK 2022-08-17T13:12:36.1786400Z 2022-08-17T13:12:36.1786531Z Generating XML reports... 
2022-08-17T13:12:36.1821434Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131232.xml 2022-08-17T13:12:37.9305517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:37.9306053Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:37.9308645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:37.9309134Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:38.1081771Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:12:38.1097140Z 2022-08-17T13:12:38.1097366Z Running tests... 2022-08-17T13:12:38.1097801Z ---------------------------------------------------------------------- 2022-08-17T13:12:39.6224965Z test_gloo_warn_not_in_group (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:12:39.6421795Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42782 2022-08-17T13:12:39.6428295Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42783 2022-08-17T13:12:41.0474270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:41.0474784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:41.0483814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:41.0484302Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:41.0728622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:41.0729092Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-08-17T13:12:41.0740085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:41.0740563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:41.2134154Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:12:41.2463589Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:12:41.2573879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:12:41.2574374Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:12:41.2575138Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:12:41.2575828Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:12:41.2576875Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:12:41.2580606Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:12:41.2581349Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:12:41.2680375Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:12:42.9514933Z ok (4.841s) 2022-08-17T13:12:42.9515132Z 2022-08-17T13:12:42.9515501Z ---------------------------------------------------------------------- 2022-08-17T13:12:42.9515859Z Ran 1 test in 4.842s 2022-08-17T13:12:42.9516025Z 2022-08-17T13:12:42.9516120Z OK 2022-08-17T13:12:42.9516252Z 2022-08-17T13:12:42.9516384Z Generating XML reports... 
2022-08-17T13:12:42.9552333Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131238.xml 2022-08-17T13:12:44.7161638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:44.7162164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:44.7165029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:44.7165517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:44.8901889Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:12:44.8917423Z 2022-08-17T13:12:44.8917644Z Running tests... 2022-08-17T13:12:44.8918062Z ---------------------------------------------------------------------- 2022-08-17T13:12:46.3955539Z test_sequence_num_incremented_gloo_default (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:12:46.4149646Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42896 2022-08-17T13:12:46.4155856Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42897 2022-08-17T13:12:47.8118793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:47.8119313Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:47.8128349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:47.8128812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:47.8438231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:47.8438691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:47.8450109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:47.8450615Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:47.9785232Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:12:48.0157369Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:12:48.0274572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:12:48.0275108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:12:48.0275854Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:12:48.0276554Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:12:48.0587147Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:12:48.0587668Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:12:48.0588516Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:12:48.0589216Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:12:49.8244109Z ok (4.932s) 2022-08-17T13:12:49.8244465Z 2022-08-17T13:12:49.8244937Z ---------------------------------------------------------------------- 2022-08-17T13:12:49.8245288Z Ran 1 test in 4.933s 2022-08-17T13:12:49.8245457Z 2022-08-17T13:12:49.8245554Z OK 2022-08-17T13:12:49.8245691Z 2022-08-17T13:12:49.8246304Z Generating XML reports... 2022-08-17T13:12:49.8280558Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131244.xml 2022-08-17T13:12:51.5642534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:51.5643045Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:51.5645367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:51.5645849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:51.7317311Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:12:51.7331618Z 2022-08-17T13:12:51.7331863Z Running tests... 
2022-08-17T13:12:51.7332292Z ---------------------------------------------------------------------- 2022-08-17T13:12:53.2056135Z test_sequence_num_incremented_gloo_subgroup (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:12:53.2241110Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43013 2022-08-17T13:12:53.2247370Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43014 2022-08-17T13:12:54.6106846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:54.6107838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:54.6118889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:54.6119891Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:54.6291201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:54.6292156Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:54.6302167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:54.6303141Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:54.7842140Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:12:54.7957612Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:12:55.1300225Z skip: Need at least 4 CUDA devices (3.397s) 2022-08-17T13:12:55.1300440Z 2022-08-17T13:12:55.1301293Z ---------------------------------------------------------------------- 2022-08-17T13:12:55.1301678Z Ran 1 test in 3.397s 2022-08-17T13:12:55.1301842Z 2022-08-17T13:12:55.1301961Z OK (skipped=1) 
2022-08-17T13:12:55.1302117Z 2022-08-17T13:12:55.1302227Z Generating XML reports... 2022-08-17T13:12:55.1336742Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131251.xml 2022-08-17T13:12:56.8818418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:56.8818925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:56.8820969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:56.8821761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:57.0512205Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:12:57.0527339Z 2022-08-17T13:12:57.0527741Z Running tests... 2022-08-17T13:12:57.0528228Z ---------------------------------------------------------------------- 2022-08-17T13:12:58.5270202Z test_sequence_num_set_default_pg_gloo (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:12:58.5457084Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43116 2022-08-17T13:12:58.5462887Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43117 2022-08-17T13:12:59.9587962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:59.9588466Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:59.9598798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:59.9599276Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:12:59.9779964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:12:59.9780425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:12:59.9791294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:12:59.9791774Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:00.1308548Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:13:00.1435804Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:13:00.1623469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:13:00.1623998Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:13:00.1624752Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:13:00.1625426Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:13:00.4520290Z ok (3.399s) 2022-08-17T13:13:00.4520500Z 2022-08-17T13:13:00.4520881Z ---------------------------------------------------------------------- 2022-08-17T13:13:00.4521223Z Ran 1 test in 3.399s 2022-08-17T13:13:00.4521374Z 2022-08-17T13:13:00.4521469Z OK 2022-08-17T13:13:00.4521604Z 2022-08-17T13:13:00.4521745Z Generating XML reports... 2022-08-17T13:13:00.4557450Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131257.xml 2022-08-17T13:13:02.1954661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:02.1955180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:02.1957476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:02.1957995Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:02.3654642Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:13:02.3668830Z 2022-08-17T13:13:02.3669101Z Running tests... 2022-08-17T13:13:02.3669541Z ---------------------------------------------------------------------- 2022-08-17T13:13:03.8443688Z test_sequence_num_set_gloo_new_group (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:13:03.8632156Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43225 2022-08-17T13:13:03.8638149Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43226 2022-08-17T13:13:05.2620303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:05.2620833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:05.2629573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:05.2630058Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:05.2835862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:05.2836329Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:05.2847592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:05.2848078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:05.4272466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:13:05.4534762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:13:05.4748038Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:13:05.4748555Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:13:05.4749277Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:13:05.4749958Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:13:05.5060400Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:13:05.5060914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:13:05.5061607Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:13:05.5062284Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:13:05.8694163Z ok (3.502s) 2022-08-17T13:13:05.8694372Z 2022-08-17T13:13:05.8694774Z ---------------------------------------------------------------------- 2022-08-17T13:13:05.8695100Z Ran 1 test in 3.502s 2022-08-17T13:13:05.8695271Z 2022-08-17T13:13:05.8695366Z OK 2022-08-17T13:13:05.8695502Z 2022-08-17T13:13:05.8695638Z Generating XML reports... 2022-08-17T13:13:05.8730472Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131302.xml 2022-08-17T13:13:07.6307323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:07.6307844Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:07.6310546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:07.6311049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:07.8074420Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:13:07.8089532Z 2022-08-17T13:13:07.8089836Z Running tests... 
2022-08-17T13:13:07.8090251Z ---------------------------------------------------------------------- 2022-08-17T13:13:07.8097938Z test_ddp_checkpointing_dynamic_module (__main__.DistributedDataParallelTest) 2022-08-17T13:13:09.3254978Z Dynamic module can be checkpointed, multiple times, with non-reentrant ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:13:09.3450392Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43340 2022-08-17T13:13:09.3456856Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43341 2022-08-17T13:13:10.7491025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:10.7491549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:10.7499791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:10.7500272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:10.7841706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:10.7842509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:10.7853637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:10.7854114Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:10.9149367Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:13:10.9563382Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:13:12.2743559Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvcp380he 2022-08-17T13:13:12.2744729Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpvcp380he/_remote_module_non_scriptable.py 2022-08-17T13:13:12.3009899Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw26qwmfe 2022-08-17T13:13:12.3012431Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw26qwmfe/_remote_module_non_scriptable.py 2022-08-17T13:13:12.7344401Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:12.7345017Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:12.7456202Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:12.7456778Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:13.2559693Z ok (5.447s) 2022-08-17T13:13:13.2560062Z 2022-08-17T13:13:13.2560780Z ---------------------------------------------------------------------- 2022-08-17T13:13:13.2561259Z Ran 1 test in 5.447s 2022-08-17T13:13:13.2561428Z 2022-08-17T13:13:13.2561520Z OK 2022-08-17T13:13:13.2561654Z 2022-08-17T13:13:13.2561788Z Generating XML reports... 
2022-08-17T13:13:13.2596458Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131307.xml 2022-08-17T13:13:15.0587769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:15.0588295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:15.0591338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:15.0591863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:15.2352486Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:13:15.2366826Z 2022-08-17T13:13:15.2367041Z Running tests... 2022-08-17T13:13:15.2367907Z ---------------------------------------------------------------------- 2022-08-17T13:13:15.2375417Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.DistributedDataParallelTest) 2022-08-17T13:13:16.7399319Z Dynamic module can be checkpointed multiple times with weight sharing ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:13:16.7585714Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43455 2022-08-17T13:13:16.7592383Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43456 2022-08-17T13:13:18.1651174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:18.1651716Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:18.1660088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:18.1660572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:18.1839828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:18.1840274Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:18.1851529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:18.1851998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:18.3299070Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:13:18.3555011Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:13:19.6631795Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg05yefnu 2022-08-17T13:13:19.6632397Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg05yefnu/_remote_module_non_scriptable.py 2022-08-17T13:13:19.6763273Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp73x7w689 2022-08-17T13:13:19.6766251Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp73x7w689/_remote_module_non_scriptable.py 2022-08-17T13:13:20.1167107Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:20.1167710Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:20.1189368Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:20.1189919Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:20.6691361Z ok (5.432s) 2022-08-17T13:13:20.6691527Z 2022-08-17T13:13:20.6691919Z ---------------------------------------------------------------------- 2022-08-17T13:13:20.6692253Z Ran 1 test in 5.432s 2022-08-17T13:13:20.6692420Z 2022-08-17T13:13:20.6692517Z OK 2022-08-17T13:13:20.6692653Z 2022-08-17T13:13:20.6692766Z Generating XML reports... 2022-08-17T13:13:20.6728886Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131315.xml 2022-08-17T13:13:22.4765356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:22.4766237Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:22.4771968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:22.4772748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:22.6518905Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:13:22.6533969Z 2022-08-17T13:13:22.6534345Z Running tests... 
2022-08-17T13:13:22.6534809Z ---------------------------------------------------------------------- 2022-08-17T13:13:22.6544954Z test_ddp_checkpointing_once_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:13:24.1756887Z DDP works as expected when layer is checkpointed only once. ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:13:24.1951599Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43570 2022-08-17T13:13:24.1958482Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43571 2022-08-17T13:13:25.6475295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:25.6484363Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:25.6484976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:25.6485457Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:25.6745526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:25.6745983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:25.6758429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:25.6758909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:25.8136257Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:13:25.8455332Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:13:27.1434099Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp01pc0bwk 2022-08-17T13:13:27.1435164Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmp01pc0bwk/_remote_module_non_scriptable.py 2022-08-17T13:13:27.1435734Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_2inbo3y 2022-08-17T13:13:27.1437333Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_2inbo3y/_remote_module_non_scriptable.py 2022-08-17T13:13:27.5741958Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:27.5742574Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:27.5788121Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:27.5788667Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:27.5855404Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:27.5855912Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:27.6163753Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:27.6164244Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:27.6317548Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-08-17T13:13:27.6318622Z warnings.warn( 2022-08-17T13:13:27.6319709Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:13:27.6320431Z warnings.warn( 2022-08-17T13:13:27.6429413Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:27.6429931Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:27.6644583Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:27.6645292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:27.6943571Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:27.6944057Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:27.7216712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:27.7217222Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:28.2061738Z ok (5.552s) 2022-08-17T13:13:28.2061948Z 2022-08-17T13:13:28.2062348Z ---------------------------------------------------------------------- 2022-08-17T13:13:28.2062689Z Ran 1 test in 5.553s 2022-08-17T13:13:28.2062858Z 2022-08-17T13:13:28.2062940Z OK 2022-08-17T13:13:28.2063078Z 2022-08-17T13:13:28.2063217Z Generating XML reports... 
2022-08-17T13:13:28.2102187Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131322.xml 2022-08-17T13:13:29.9579094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:29.9579588Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:29.9582073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:29.9582552Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:30.1324634Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:13:30.1339943Z 2022-08-17T13:13:30.1340085Z Running tests... 2022-08-17T13:13:30.1340523Z ---------------------------------------------------------------------- 2022-08-17T13:13:30.1350431Z test_ddp_checkpointing_once_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:13:31.6487814Z DDP works as expected when layer is checkpointed only once. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:13:31.6683956Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43685 2022-08-17T13:13:31.6691258Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43686 2022-08-17T13:13:33.1230360Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:33.1230858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:33.1240013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:33.1240479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:33.1438439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:33.1438926Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:33.1450285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:33.1451032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:33.2881277Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:13:33.3186913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:13:34.6261198Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy5jkdgeg 2022-08-17T13:13:34.6261813Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy5jkdgeg/_remote_module_non_scriptable.py 2022-08-17T13:13:34.6447098Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp24fog7qx 2022-08-17T13:13:34.6449821Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp24fog7qx/_remote_module_non_scriptable.py 2022-08-17T13:13:35.0794163Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:35.0794973Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:35.0867604Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:35.0868169Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:35.0938615Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:35.0939119Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:35.1270261Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:35.1270804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:35.1436019Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:13:35.1436870Z warnings.warn( 2022-08-17T13:13:35.1437918Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-08-17T13:13:35.1438636Z warnings.warn( 2022-08-17T13:13:35.1554669Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:35.1555189Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:35.1783573Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:35.1784097Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:35.2107463Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:35.2107997Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:35.2382226Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:35.2382747Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:35.6789739Z ok (5.545s) 2022-08-17T13:13:35.6790073Z 2022-08-17T13:13:35.6790831Z ---------------------------------------------------------------------- 2022-08-17T13:13:35.6791481Z Ran 1 test in 5.545s 2022-08-17T13:13:35.6791659Z 2022-08-17T13:13:35.6791762Z OK 2022-08-17T13:13:35.6791899Z 2022-08-17T13:13:35.6792321Z Generating XML reports... 
2022-08-17T13:13:35.6828177Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131330.xml 2022-08-17T13:13:37.4732904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:37.4733421Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:37.4735215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:37.4736070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:37.6483642Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:13:37.6498014Z 2022-08-17T13:13:37.6498401Z Running tests... 2022-08-17T13:13:37.6499347Z ---------------------------------------------------------------------- 2022-08-17T13:13:37.6507285Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:13:39.1523158Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:13:39.1717309Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43800 2022-08-17T13:13:39.1723537Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43801 2022-08-17T13:13:40.5722468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:40.5722971Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:40.5731818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:40.5732326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:40.6021399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:40.6021856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:40.6033158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:40.6033639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:40.7379630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:13:40.7728850Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:13:42.0727188Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqa5g_4pp 2022-08-17T13:13:42.0727815Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqa5g_4pp/_remote_module_non_scriptable.py 2022-08-17T13:13:42.1072784Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa3s_m5_s 2022-08-17T13:13:42.1074034Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa3s_m5_s/_remote_module_non_scriptable.py 2022-08-17T13:13:42.5494222Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:42.5494852Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:42.5562358Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:42.5562923Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:42.5704461Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:42.5704969Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:42.6032592Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:42.6033092Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:43.0821806Z ok (5.432s) 2022-08-17T13:13:43.0821989Z 2022-08-17T13:13:43.0822688Z ---------------------------------------------------------------------- 2022-08-17T13:13:43.0823033Z Ran 1 test in 5.432s 2022-08-17T13:13:43.0823202Z 2022-08-17T13:13:43.0823590Z OK 2022-08-17T13:13:43.0823742Z 2022-08-17T13:13:43.0823872Z Generating XML reports... 
2022-08-17T13:13:43.0859579Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131337.xml 2022-08-17T13:13:44.8400780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:44.8402222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:44.8403474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:44.8404377Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:45.0186793Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:13:45.0203556Z 2022-08-17T13:13:45.0204093Z Running tests... 2022-08-17T13:13:45.0204974Z ---------------------------------------------------------------------- 2022-08-17T13:13:45.0213954Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:13:46.5393539Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:13:46.5593054Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43915 2022-08-17T13:13:46.5599529Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43916 2022-08-17T13:13:47.9976926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:47.9977417Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:47.9986603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:47.9987086Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:48.0073616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:48.0074058Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:48.0085089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:48.0085579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:48.1650937Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:13:48.1774781Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:13:49.4868022Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprixrmh1r 2022-08-17T13:13:49.4869150Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprixrmh1r/_remote_module_non_scriptable.py 2022-08-17T13:13:49.5088188Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphf9mj2qh 2022-08-17T13:13:49.5090628Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphf9mj2qh/_remote_module_non_scriptable.py 2022-08-17T13:13:49.9409097Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:49.9409729Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:49.9520202Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:49.9520782Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:49.9662386Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:49.9662940Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:49.9998796Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:49.9999314Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:50.4717104Z ok (5.451s) 2022-08-17T13:13:50.4717313Z 2022-08-17T13:13:50.4717722Z ---------------------------------------------------------------------- 2022-08-17T13:13:50.4718383Z Ran 1 test in 5.451s 2022-08-17T13:13:50.4718554Z 2022-08-17T13:13:50.4718647Z OK 2022-08-17T13:13:50.4718782Z 2022-08-17T13:13:50.4718919Z Generating XML reports... 
2022-08-17T13:13:50.4756941Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131345.xml 2022-08-17T13:13:52.2781781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:52.2782303Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:52.2784476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:52.2784968Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:52.4540756Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:13:52.4555617Z 2022-08-17T13:13:52.4555861Z Running tests... 2022-08-17T13:13:52.4556281Z ---------------------------------------------------------------------- 2022-08-17T13:13:52.4566261Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:13:53.9814635Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:13:54.0010268Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44030 2022-08-17T13:13:54.0016190Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44031 2022-08-17T13:13:55.4026632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:55.4027182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:55.4036099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:55.4036645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:55.4312957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:55.4313569Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:55.4323818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:55.4324353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:55.5671458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:13:55.5961372Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:13:56.8755574Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmyq2ksl1 2022-08-17T13:13:56.8756283Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmyq2ksl1/_remote_module_non_scriptable.py 2022-08-17T13:13:56.8926858Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6c9mctzo 2022-08-17T13:13:56.8929977Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6c9mctzo/_remote_module_non_scriptable.py 2022-08-17T13:13:57.3243962Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:57.3244560Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:57.3290508Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:13:57.3291076Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:13:57.3367573Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:57.3368392Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:57.3623750Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:13:57.3625642Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-08-17T13:13:57.4013158Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:57.4013668Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:13:57.9115825Z ok (5.456s) 2022-08-17T13:13:57.9116067Z 2022-08-17T13:13:57.9116478Z ---------------------------------------------------------------------- 2022-08-17T13:13:57.9116820Z Ran 1 test in 5.456s 2022-08-17T13:13:57.9116986Z 2022-08-17T13:13:57.9117080Z OK 2022-08-17T13:13:57.9117198Z 2022-08-17T13:13:57.9117332Z Generating XML reports... 2022-08-17T13:13:57.9154148Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131352.xml 2022-08-17T13:13:59.7049938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:13:59.7050446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:13:59.7052793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:13:59.7053285Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:13:59.8806262Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:13:59.8821495Z 2022-08-17T13:13:59.8821727Z Running tests... 2022-08-17T13:13:59.8822140Z ---------------------------------------------------------------------- 2022-08-17T13:13:59.8832483Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:14:01.4090210Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:14:01.4284931Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44145 2022-08-17T13:14:01.4291260Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44146 2022-08-17T13:14:02.8608424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:02.8608950Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:02.8617338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:02.8617827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:02.8738539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:02.8738997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:02.8749862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:02.8750516Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:03.0256991Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:14:03.0391757Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:14:04.3164211Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqg_k3unv 2022-08-17T13:14:04.3165011Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqg_k3unv/_remote_module_non_scriptable.py 2022-08-17T13:14:04.3290943Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp31rca4ht 2022-08-17T13:14:04.3293773Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp31rca4ht/_remote_module_non_scriptable.py 2022-08-17T13:14:04.7619027Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:04.7619658Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:04.7630994Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:04.7631548Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:05.2392199Z ok (5.357s) 2022-08-17T13:14:05.2392590Z 2022-08-17T13:14:05.2393321Z ---------------------------------------------------------------------- 2022-08-17T13:14:05.2393738Z Ran 1 test in 5.357s 2022-08-17T13:14:05.2393908Z 2022-08-17T13:14:05.2393984Z OK 2022-08-17T13:14:05.2394120Z 2022-08-17T13:14:05.2394255Z Generating XML reports... 2022-08-17T13:14:05.2430365Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131359.xml 2022-08-17T13:14:07.0050898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:07.0051432Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:07.0053948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:07.0054695Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:07.1739037Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:14:07.1754273Z 2022-08-17T13:14:07.1754606Z Running tests... 
2022-08-17T13:14:07.1755528Z ---------------------------------------------------------------------- 2022-08-17T13:14:07.1762683Z test_ddp_checkpointing_twice_weight_sharing (__main__.DistributedDataParallelTest) 2022-08-17T13:14:08.6504551Z Checkpointing should work with static graph in the case of checkpointing ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:14:08.6693874Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44260 2022-08-17T13:14:08.6699757Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44261 2022-08-17T13:14:10.0707343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:10.0707954Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:10.0716559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:10.0717052Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:10.0881542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:10.0881993Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:10.0893075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:10.0893792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:10.2382430Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:14:10.2582467Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:14:11.5578766Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbghsny3e 2022-08-17T13:14:11.5580046Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpbghsny3e/_remote_module_non_scriptable.py 2022-08-17T13:14:11.5780095Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkckhiiby 2022-08-17T13:14:11.5782919Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkckhiiby/_remote_module_non_scriptable.py 2022-08-17T13:14:12.0167511Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:12.0168136Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:12.0245736Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:12.0246311Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:12.0378038Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:12.0378545Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:12.0684273Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:12.0684766Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:12.5795081Z ok (5.404s) 2022-08-17T13:14:12.5795258Z 2022-08-17T13:14:12.5796234Z ---------------------------------------------------------------------- 2022-08-17T13:14:12.5797010Z Ran 1 test in 5.404s 2022-08-17T13:14:12.5797284Z 2022-08-17T13:14:12.5797382Z OK 2022-08-17T13:14:12.5797525Z 2022-08-17T13:14:12.5797670Z Generating XML reports... 
2022-08-17T13:14:12.5833374Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131407.xml 2022-08-17T13:14:14.3612005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:14.3612508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:14.3615008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:14.3615501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:14.5376899Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:14:14.5392593Z 2022-08-17T13:14:14.5392817Z Running tests... 2022-08-17T13:14:14.5393252Z ---------------------------------------------------------------------- 2022-08-17T13:14:14.5403662Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:14:16.0522601Z With reentrant autograd checkpointing impl, DDP will fail when there are ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:14:16.0711138Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44375 2022-08-17T13:14:16.0717221Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44376 2022-08-17T13:14:17.4708736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:17.4709222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:17.4718026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:17.4718858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:17.4958433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:17.4958885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:17.4969863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:17.4970345Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:17.6382532Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:14:17.6687562Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:14:18.9601347Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv4fu18_k 2022-08-17T13:14:18.9602737Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv4fu18_k/_remote_module_non_scriptable.py 2022-08-17T13:14:18.9830065Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaph86aic 2022-08-17T13:14:18.9832624Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaph86aic/_remote_module_non_scriptable.py 2022-08-17T13:14:19.4163716Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:19.4164305Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:19.4166080Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:14:19.4306761Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:19.4307324Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:19.4310909Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-08-17T13:14:19.4594829Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:14:19.4595599Z warnings.warn( 2022-08-17T13:14:19.4596659Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:14:19.4597371Z warnings.warn( 2022-08-17T13:14:19.4709525Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:19.4710169Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:19.5257516Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:19.5258005Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:19.9817039Z ok (5.442s) 2022-08-17T13:14:19.9817356Z 2022-08-17T13:14:19.9817780Z ---------------------------------------------------------------------- 2022-08-17T13:14:19.9818102Z Ran 1 test in 5.442s 2022-08-17T13:14:19.9818265Z 2022-08-17T13:14:19.9818358Z OK 2022-08-17T13:14:19.9818493Z 2022-08-17T13:14:19.9818626Z Generating XML reports... 
2022-08-17T13:14:19.9855439Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131414.xml 2022-08-17T13:14:21.7620889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:21.7621607Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:21.7624133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:21.7624614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:21.9374609Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:14:21.9390310Z 2022-08-17T13:14:21.9390674Z Running tests... 2022-08-17T13:14:21.9391121Z ---------------------------------------------------------------------- 2022-08-17T13:14:21.9400974Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:14:23.4617202Z With reentrant autograd checkpointing impl, DDP will fail when there are ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:14:23.4812550Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44490 2022-08-17T13:14:23.4818490Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44491 2022-08-17T13:14:24.9042743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:24.9043287Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:24.9053070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:24.9053570Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:24.9532871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:24.9533313Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:24.9544585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:24.9545067Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:25.0761634Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:14:25.1242255Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:14:26.4383557Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgd96lueq 2022-08-17T13:14:26.4384868Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgd96lueq/_remote_module_non_scriptable.py 2022-08-17T13:14:26.4399049Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo1qdzpc5 2022-08-17T13:14:26.4401997Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo1qdzpc5/_remote_module_non_scriptable.py 2022-08-17T13:14:26.8876875Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:26.8877770Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:26.8884135Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:26.8884689Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:26.9027084Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:14:26.9027931Z warnings.warn( 2022-08-17T13:14:26.9028990Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:14:26.9029702Z warnings.warn( 2022-08-17T13:14:26.9158384Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:26.9158876Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:26.9576564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:26.9577066Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:14:27.4918551Z ok (5.552s) 2022-08-17T13:14:27.4918754Z 2022-08-17T13:14:27.4919144Z ---------------------------------------------------------------------- 2022-08-17T13:14:27.4919485Z Ran 1 test in 5.553s 2022-08-17T13:14:27.4919676Z 2022-08-17T13:14:27.4919771Z OK 2022-08-17T13:14:27.4919907Z 2022-08-17T13:14:27.4920045Z Generating XML reports... 2022-08-17T13:14:27.4955967Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131421.xml 2022-08-17T13:14:29.2912563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:29.2913057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:29.2915512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:29.2915980Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:29.4679659Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:14:29.4694753Z 2022-08-17T13:14:29.4694895Z Running tests... 2022-08-17T13:14:29.4695525Z ---------------------------------------------------------------------- 2022-08-17T13:14:29.4707577Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-08-17T13:14:30.9785925Z Test that checkpointing with weight sharing works. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:14:30.9981583Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44605 2022-08-17T13:14:30.9988046Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44606 2022-08-17T13:14:32.3812164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:32.3812674Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:32.3821349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:32.3821829Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:32.4189516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:32.4189983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:32.4200931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:32.4201399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:32.5455397Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:14:32.5909850Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:14:33.9064614Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0l61b83q 2022-08-17T13:14:33.9065433Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0l61b83q/_remote_module_non_scriptable.py 2022-08-17T13:14:33.9198916Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp627q6cjg 2022-08-17T13:14:33.9201765Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp627q6cjg/_remote_module_non_scriptable.py 2022-08-17T13:14:34.3553131Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:34.3553721Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:34.3622778Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:34.3623336Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:34.3691278Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:34.3691764Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:34.4063437Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:34.4063939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:34.9088871Z ok (5.439s) 2022-08-17T13:14:34.9089083Z 2022-08-17T13:14:34.9089464Z ---------------------------------------------------------------------- 2022-08-17T13:14:34.9089787Z Ran 1 test in 5.439s 2022-08-17T13:14:34.9089955Z 2022-08-17T13:14:34.9090053Z OK 2022-08-17T13:14:34.9090187Z 2022-08-17T13:14:34.9090321Z Generating XML reports... 
2022-08-17T13:14:34.9126401Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131429.xml 2022-08-17T13:14:36.7002320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:36.7002825Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:36.7005395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:36.7005935Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:36.8768027Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:14:36.8782658Z 2022-08-17T13:14:36.8783042Z Running tests... 2022-08-17T13:14:36.8783678Z ---------------------------------------------------------------------- 2022-08-17T13:14:36.8795217Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-08-17T13:14:38.3852157Z Test that checkpointing with weight sharing works. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:14:38.4049169Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44720 2022-08-17T13:14:38.4055606Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44721 2022-08-17T13:14:39.7592376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:39.7593218Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:39.7601914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:39.7602381Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:39.8246963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:39.8247430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:39.8258311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:39.8258765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:39.9238852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:14:39.9949164Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:14:41.2932710Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv0vniua2 2022-08-17T13:14:41.2933553Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv0vniua2/_remote_module_non_scriptable.py 2022-08-17T13:14:41.3230804Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiifmh0m7 2022-08-17T13:14:41.3233280Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiifmh0m7/_remote_module_non_scriptable.py 2022-08-17T13:14:41.7550153Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:41.7550744Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:41.7691211Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:41.7691797Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:41.7765450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:41.7765938Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:41.8074465Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:41.8074980Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:41.8287872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:41.8288357Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:41.8591660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:41.8592180Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:14:42.3170119Z ok (5.438s) 2022-08-17T13:14:42.3170324Z 2022-08-17T13:14:42.3171003Z ---------------------------------------------------------------------- 2022-08-17T13:14:42.3171342Z Ran 1 test in 5.439s 2022-08-17T13:14:42.3171509Z 2022-08-17T13:14:42.3171602Z OK 2022-08-17T13:14:42.3171738Z 2022-08-17T13:14:42.3171874Z Generating XML reports... 
2022-08-17T13:14:42.3207151Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131436.xml 2022-08-17T13:14:44.0790900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:44.0791404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:44.0793903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:44.0794660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:44.2473792Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:14:44.2487998Z 2022-08-17T13:14:44.2488471Z Running tests... 2022-08-17T13:14:44.2488963Z ---------------------------------------------------------------------- 2022-08-17T13:14:44.2496796Z test_ddp_comm_hook_future_passing_cpu (__main__.DistributedDataParallelTest) 2022-08-17T13:14:45.7066632Z This unit test verifies whether the Future object is passed properly. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:14:45.7255300Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44835 2022-08-17T13:14:45.7261104Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44836 2022-08-17T13:14:47.1791341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:47.1791838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:47.1800913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:47.1801405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:47.1806884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:47.1807331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:47.1818740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:47.1819216Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:47.3472180Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:14:47.3545133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:14:47.3858379Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpguxbjzfa 2022-08-17T13:14:47.3861275Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpguxbjzfa/_remote_module_non_scriptable.py 2022-08-17T13:14:47.3861823Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq9ge5iuo 2022-08-17T13:14:47.3864799Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq9ge5iuo/_remote_module_non_scriptable.py 2022-08-17T13:14:47.7316865Z ok 
(3.483s) 2022-08-17T13:14:47.7317082Z 2022-08-17T13:14:47.7317496Z ---------------------------------------------------------------------- 2022-08-17T13:14:47.7317828Z Ran 1 test in 3.483s 2022-08-17T13:14:47.7317999Z 2022-08-17T13:14:47.7318104Z OK 2022-08-17T13:14:47.7318241Z 2022-08-17T13:14:47.7318381Z Generating XML reports... 2022-08-17T13:14:47.7354373Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131444.xml 2022-08-17T13:14:49.5022301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:49.5023104Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:49.5025014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:49.5025503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:49.6764535Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:14:49.6779218Z 2022-08-17T13:14:49.6779714Z Running tests... 2022-08-17T13:14:49.6780227Z ---------------------------------------------------------------------- 2022-08-17T13:14:49.6788549Z test_ddp_comm_hook_future_passing_gpu_gloo (__main__.DistributedDataParallelTest) 2022-08-17T13:14:51.1885015Z This unit test verifies whether the Future object is passed properly using gloo backend. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:14:51.2080923Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44948 2022-08-17T13:14:51.2086964Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44949 2022-08-17T13:14:52.6189256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:52.6189829Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:52.6198588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:52.6199198Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:52.6350557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:52.6351068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:52.6362312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:52.6362951Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:52.7838510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:14:52.8056055Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:14:54.1072182Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx4ugp7_2 2022-08-17T13:14:54.1072855Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx4ugp7_2/_remote_module_non_scriptable.py 2022-08-17T13:14:54.1385290Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmponlthtb3 2022-08-17T13:14:54.1386951Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmponlthtb3/_remote_module_non_scriptable.py 2022-08-17T13:14:54.1569663Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:54.1570474Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:54.1571193Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:14:54.1571742Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:14:54.6174630Z ok (4.939s) 2022-08-17T13:14:54.6174982Z 2022-08-17T13:14:54.6175615Z ---------------------------------------------------------------------- 2022-08-17T13:14:54.6176200Z Ran 1 test in 4.939s 2022-08-17T13:14:54.6176481Z 2022-08-17T13:14:54.6176643Z OK 2022-08-17T13:14:54.6176888Z 2022-08-17T13:14:54.6177096Z Generating XML reports... 2022-08-17T13:14:54.6214024Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131449.xml 2022-08-17T13:14:56.3702846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:56.3703918Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:56.3705991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:56.3706482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:56.5406479Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:14:56.5421519Z 2022-08-17T13:14:56.5421926Z Running tests... 
2022-08-17T13:14:56.5422410Z ---------------------------------------------------------------------- 2022-08-17T13:14:56.5432778Z test_ddp_comm_hook_register_just_once (__main__.DistributedDataParallelTest) 2022-08-17T13:14:58.0174057Z DDP communication hook can only be registered once. This test validates whether ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:14:58.0364016Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45063 2022-08-17T13:14:58.0370413Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45064 2022-08-17T13:14:59.4834541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:59.4835053Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:59.4843714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:59.4844187Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:59.5301243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:14:59.5301710Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:14:59.5313278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:14:59.5313752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:14:59.6501486Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:14:59.7034202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:14:59.7320193Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7izz5itu 2022-08-17T13:14:59.7322448Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmp7izz5itu/_remote_module_non_scriptable.py 2022-08-17T13:14:59.7323476Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9mtrnhll 2022-08-17T13:14:59.7326963Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9mtrnhll/_remote_module_non_scriptable.py 2022-08-17T13:15:00.0426907Z ok (3.500s) 2022-08-17T13:15:00.0427130Z 2022-08-17T13:15:00.0427532Z ---------------------------------------------------------------------- 2022-08-17T13:15:00.0427857Z Ran 1 test in 3.500s 2022-08-17T13:15:00.0428022Z 2022-08-17T13:15:00.0428136Z OK 2022-08-17T13:15:00.0428273Z 2022-08-17T13:15:00.0428408Z Generating XML reports... 2022-08-17T13:15:00.0463142Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131456.xml 2022-08-17T13:15:01.8241209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:01.8241690Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:01.8243814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:01.8244295Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:02.0031083Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:15:02.0045124Z 2022-08-17T13:15:02.0045399Z Running tests... 2022-08-17T13:15:02.0046163Z ---------------------------------------------------------------------- 2022-08-17T13:15:02.0057087Z test_ddp_comm_hook_sparse_gradients (__main__.DistributedDataParallelTest) 2022-08-17T13:15:03.5206757Z Runs "test_sparse_gradients" unit test with DDP communication hook. We define a ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:15:03.5395518Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45172 2022-08-17T13:15:03.5401998Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45173 2022-08-17T13:15:04.9292105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:04.9292611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:04.9301573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:04.9302064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:04.9847432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:04.9847898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:04.9858304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:04.9858786Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:05.0949019Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:15:05.1491619Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:15:05.1814737Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplep2n95h 2022-08-17T13:15:05.1815459Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuw4iub2b 2022-08-17T13:15:05.1817266Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplep2n95h/_remote_module_non_scriptable.py 2022-08-17T13:15:05.1817838Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuw4iub2b/_remote_module_non_scriptable.py 2022-08-17T13:15:05.5457327Z ok 
(3.541s) 2022-08-17T13:15:05.5457695Z 2022-08-17T13:15:05.5458424Z ---------------------------------------------------------------------- 2022-08-17T13:15:05.5458890Z Ran 1 test in 3.541s 2022-08-17T13:15:05.5459058Z 2022-08-17T13:15:05.5459158Z OK 2022-08-17T13:15:05.5459298Z 2022-08-17T13:15:05.5459869Z Generating XML reports... 2022-08-17T13:15:05.5495436Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131502.xml 2022-08-17T13:15:07.3319540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:07.3320041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:07.3322453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:07.3322942Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:07.5089229Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:15:07.5103655Z 2022-08-17T13:15:07.5104109Z Running tests... 2022-08-17T13:15:07.5104593Z ---------------------------------------------------------------------- 2022-08-17T13:15:07.5115077Z test_ddp_invalid_comm_hook_init (__main__.DistributedDataParallelTest) 2022-08-17T13:15:09.0437723Z This unit test makes sure that register_comm_hook properly checks the format ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:15:09.0633593Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45315 2022-08-17T13:15:09.0640109Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45316 2022-08-17T13:15:10.4625173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:10.4625747Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:10.4634638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:10.4635129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:10.4750759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:10.4751247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:10.4761794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:10.4762484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:10.6284760Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:15:10.6387445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:15:10.6699266Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzz2m5g4_ 2022-08-17T13:15:10.6699815Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplt9bs7aq 2022-08-17T13:15:10.6701847Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzz2m5g4_/_remote_module_non_scriptable.py 2022-08-17T13:15:10.6702406Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplt9bs7aq/_remote_module_non_scriptable.py 2022-08-17T13:15:10.9693890Z ok 
(3.459s) 2022-08-17T13:15:10.9694108Z 2022-08-17T13:15:10.9694514Z ---------------------------------------------------------------------- 2022-08-17T13:15:10.9694857Z Ran 1 test in 3.459s 2022-08-17T13:15:10.9695029Z 2022-08-17T13:15:10.9695125Z OK 2022-08-17T13:15:10.9695265Z 2022-08-17T13:15:10.9695405Z Generating XML reports... 2022-08-17T13:15:10.9733115Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131507.xml 2022-08-17T13:15:12.7352865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:12.7353365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:12.7355572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:12.7356061Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:12.9106517Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:15:12.9121701Z 2022-08-17T13:15:12.9122136Z Running tests... 2022-08-17T13:15:12.9122616Z ---------------------------------------------------------------------- 2022-08-17T13:15:12.9134386Z test_ddp_invalid_comm_hook_return_type (__main__.DistributedDataParallelTest) 2022-08-17T13:15:14.4601573Z This test checks whether return annotation checked properly if defined. It also ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:15:14.4795101Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45424 2022-08-17T13:15:14.4801294Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45425 2022-08-17T13:15:15.8693340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:15.8693841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:15.8702344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:15.8702834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:15.8986516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:15.8987283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:15.8997975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:15.8998460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:16.0340266Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:15:16.0686639Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:15:16.0997558Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbydfxum5 2022-08-17T13:15:16.1000046Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbydfxum5/_remote_module_non_scriptable.py 2022-08-17T13:15:16.1000773Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgr8gi11b 2022-08-17T13:15:16.1002825Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgr8gi11b/_remote_module_non_scriptable.py 2022-08-17T13:15:16.4857167Z ok 
(3.573s) 2022-08-17T13:15:16.4857496Z 2022-08-17T13:15:16.4857922Z ---------------------------------------------------------------------- 2022-08-17T13:15:16.4858262Z Ran 1 test in 3.573s 2022-08-17T13:15:16.4858430Z 2022-08-17T13:15:16.4858506Z OK 2022-08-17T13:15:16.4858642Z 2022-08-17T13:15:16.4858779Z Generating XML reports... 2022-08-17T13:15:16.4894173Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131512.xml 2022-08-17T13:15:18.2457352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:18.2457852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:18.2459426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:18.2459912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:18.4204569Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:15:18.4219053Z 2022-08-17T13:15:18.4219271Z Running tests... 2022-08-17T13:15:18.4219722Z ---------------------------------------------------------------------- 2022-08-17T13:15:18.4236542Z test_find_unused_parameters_when_unused_parameters_empty (__main__.DistributedDataParallelTest) 2022-08-17T13:15:19.9442220Z An empty unused_parameters array does not imply find_unused_parameters = ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:15:19.9635762Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45537 2022-08-17T13:15:19.9641992Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45538 2022-08-17T13:15:21.3978034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:21.3987960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:21.3988588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:21.3989074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:21.4464126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:21.4464610Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:21.4475785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:21.4476280Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:21.5689713Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:15:21.6206535Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:15:21.6516880Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplzyhk5_m 2022-08-17T13:15:21.6517982Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3ljxbn1r 2022-08-17T13:15:21.6518859Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplzyhk5_m/_remote_module_non_scriptable.py 2022-08-17T13:15:21.6520585Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3ljxbn1r/_remote_module_non_scriptable.py 2022-08-17T13:15:21.6680023Z [W 
reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:15:22.9989909Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:15:22.9990541Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:15:22.9991262Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:15:22.9992155Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:15:23.4745692Z ok (5.052s) 2022-08-17T13:15:23.4745902Z 2022-08-17T13:15:23.4746291Z ---------------------------------------------------------------------- 2022-08-17T13:15:23.4746673Z Ran 1 test in 5.053s 2022-08-17T13:15:23.4746842Z 2022-08-17T13:15:23.4746918Z OK 2022-08-17T13:15:23.4747055Z 2022-08-17T13:15:23.4747190Z Generating XML reports... 
2022-08-17T13:15:23.4783921Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131518.xml 2022-08-17T13:15:25.2726434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:25.2726939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:25.2729299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:25.2729767Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:25.4492620Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:15:25.4508229Z 2022-08-17T13:15:25.4508447Z Running tests... 2022-08-17T13:15:25.4508905Z ---------------------------------------------------------------------- 2022-08-17T13:15:26.9733733Z test_global_local_unused_params_grad (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:15:26.9930979Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45652 2022-08-17T13:15:26.9936916Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45653 2022-08-17T13:15:28.3872831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:28.3873372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:28.3882288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:28.3882771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:28.4216449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:28.4216939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:28.4228557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:28.4229061Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:28.5538640Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:15:28.5942303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:15:28.6256316Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjz4p6mcd 2022-08-17T13:15:28.6259129Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjz4p6mcd/_remote_module_non_scriptable.py 2022-08-17T13:15:28.6259680Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgp9kd7jl 2022-08-17T13:15:28.6262716Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgp9kd7jl/_remote_module_non_scriptable.py 2022-08-17T13:15:29.9466720Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:15:29.9467335Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:15:29.9468046Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:15:29.9468589Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:15:30.4024270Z ok (4.951s) 2022-08-17T13:15:30.4024571Z 2022-08-17T13:15:30.4025130Z ---------------------------------------------------------------------- 2022-08-17T13:15:30.4025475Z Ran 1 test in 4.952s 2022-08-17T13:15:30.4025645Z 2022-08-17T13:15:30.4025740Z OK 2022-08-17T13:15:30.4025877Z 2022-08-17T13:15:30.4026013Z Generating XML reports... 2022-08-17T13:15:30.4061825Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131525.xml 2022-08-17T13:15:32.1689892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:32.1690433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:32.1692707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:32.1693197Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:32.3431453Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:15:32.3445655Z 2022-08-17T13:15:32.3446105Z Running tests... 2022-08-17T13:15:32.3446608Z ---------------------------------------------------------------------- 2022-08-17T13:15:33.8487412Z test_global_local_unused_params_grad_with_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:15:33.8682803Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45767 2022-08-17T13:15:33.8688807Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45768 2022-08-17T13:15:35.2630939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:35.2631918Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:35.2634170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:35.2635101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:35.2643030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:35.2644036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:35.2645221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:35.2646545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:35.4373700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:15:35.4374759Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:15:35.4589384Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq5_tylkz 2022-08-17T13:15:35.4590366Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjcvxyf8n 2022-08-17T13:15:35.4591574Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq5_tylkz/_remote_module_non_scriptable.py 2022-08-17T13:15:35.4592612Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjcvxyf8n/_remote_module_non_scriptable.py 2022-08-17T13:15:36.7724178Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:15:36.7724793Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:15:36.7725498Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:15:36.7726049Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:15:37.1773863Z ok (4.832s) 2022-08-17T13:15:37.1774080Z 2022-08-17T13:15:37.1774451Z ---------------------------------------------------------------------- 2022-08-17T13:15:37.1774797Z Ran 1 test in 4.833s 2022-08-17T13:15:37.1774964Z 2022-08-17T13:15:37.1775059Z OK 2022-08-17T13:15:37.1775195Z 2022-08-17T13:15:37.1775330Z Generating XML reports... 2022-08-17T13:15:37.1809579Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131532.xml 2022-08-17T13:15:38.9453169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:38.9453659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:38.9456304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:38.9456797Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:39.1195514Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:15:39.1210698Z 2022-08-17T13:15:39.1210930Z Running tests... 2022-08-17T13:15:39.1211364Z ---------------------------------------------------------------------- 2022-08-17T13:15:40.6192198Z test_global_local_unused_params_grad_with_static_graph (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:15:40.6443852Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45882 2022-08-17T13:15:40.6449861Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45883 2022-08-17T13:15:42.0784144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:42.0784822Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:42.0793449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:42.0793946Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:42.0951747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:42.0952193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:42.0963355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:42.0964114Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:42.2441984Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:15:42.2656631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:15:42.2971719Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp87cx3g9a 2022-08-17T13:15:42.2974116Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp87cx3g9a/_remote_module_non_scriptable.py 2022-08-17T13:15:42.2975832Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpizdxlhtk 2022-08-17T13:15:42.2978670Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpizdxlhtk/_remote_module_non_scriptable.py 2022-08-17T13:15:42.3136659Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:15:42.3137419Z warnings.warn( 2022-08-17T13:15:42.3138465Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:15:42.3139188Z warnings.warn( 2022-08-17T13:15:43.6290634Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:15:43.6291258Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:15:43.6291981Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:15:43.6292506Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:15:44.0536587Z ok (4.932s) 2022-08-17T13:15:44.0537027Z 2022-08-17T13:15:44.0537751Z ---------------------------------------------------------------------- 2022-08-17T13:15:44.0538139Z Ran 1 test in 4.932s 2022-08-17T13:15:44.0538287Z 2022-08-17T13:15:44.0538390Z OK 2022-08-17T13:15:44.0538528Z 2022-08-17T13:15:44.0538667Z Generating XML reports... 
2022-08-17T13:15:44.0575401Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131539.xml 2022-08-17T13:15:45.7966203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:45.7967144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:45.7970377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:45.7971343Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:45.9723012Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:15:45.9738886Z 2022-08-17T13:15:45.9739359Z Running tests... 2022-08-17T13:15:45.9739852Z ---------------------------------------------------------------------- 2022-08-17T13:15:47.4710668Z test_gloo_backend_1gpu_module_device_ids_integer_list (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:15:47.4896596Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45997 2022-08-17T13:15:47.4902983Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45998 2022-08-17T13:15:48.8929911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:48.8930427Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:48.8940587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:48.8941223Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:48.9247330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:48.9247965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:48.9259256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:48.9260153Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:49.0578918Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:15:49.0960044Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:15:50.4001395Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcverfdny 2022-08-17T13:15:50.4002478Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcverfdny/_remote_module_non_scriptable.py 2022-08-17T13:15:50.4432507Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpljw4tink 2022-08-17T13:15:50.4434872Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpljw4tink/_remote_module_non_scriptable.py 2022-08-17T13:15:50.8641077Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:15:50.8642211Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:15:50.8731431Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:15:50.8731991Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:15:50.8797213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:15:50.8798174Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:15:51.3000777Z ok (5.326s) 2022-08-17T13:15:51.3000968Z 2022-08-17T13:15:51.3001367Z ---------------------------------------------------------------------- 2022-08-17T13:15:51.3002041Z Ran 1 test in 5.326s 2022-08-17T13:15:51.3002238Z 2022-08-17T13:15:51.3002336Z OK 2022-08-17T13:15:51.3002475Z 2022-08-17T13:15:51.3002619Z Generating XML reports... 
2022-08-17T13:15:51.3038292Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131545.xml 2022-08-17T13:15:53.0902272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:53.0902774Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:53.0905728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:53.0906219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:53.2665849Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:15:53.2681688Z 2022-08-17T13:15:53.2681935Z Running tests... 2022-08-17T13:15:53.2682353Z ---------------------------------------------------------------------- 2022-08-17T13:15:54.7759938Z test_gloo_backend_1gpu_module_device_ids_torch_device_list (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:15:54.7955781Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46112 2022-08-17T13:15:54.7962430Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46113 2022-08-17T13:15:56.2188556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:56.2189076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:56.2199362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:56.2199855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:56.2373516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:15:56.2373986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:15:56.2384421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:15:56.2384906Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:15:56.3907925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:15:56.4024546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:15:57.7103466Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6p4zmncs 2022-08-17T13:15:57.7104350Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6p4zmncs/_remote_module_non_scriptable.py 2022-08-17T13:15:57.7279534Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgw1ilcco 2022-08-17T13:15:57.7282405Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgw1ilcco/_remote_module_non_scriptable.py 2022-08-17T13:15:58.1512326Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:15:58.1512962Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:15:58.1545103Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:15:58.1545674Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:15:58.1610472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:15:58.1610991Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:15:58.6059885Z ok (5.338s) 2022-08-17T13:15:58.6060062Z 2022-08-17T13:15:58.6060462Z ---------------------------------------------------------------------- 2022-08-17T13:15:58.6060794Z Ran 1 test in 5.338s 2022-08-17T13:15:58.6060960Z 2022-08-17T13:15:58.6061058Z OK 2022-08-17T13:15:58.6061194Z 2022-08-17T13:15:58.6061322Z Generating XML reports... 
2022-08-17T13:15:58.6096450Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131553.xml 2022-08-17T13:16:00.3927811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:00.3928324Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:00.3931089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:00.3931569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:00.5683283Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:16:00.5698543Z 2022-08-17T13:16:00.5698744Z Running tests... 2022-08-17T13:16:00.5699188Z ---------------------------------------------------------------------- 2022-08-17T13:16:02.0940216Z test_gloo_backend_2gpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:16:02.1138731Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46227 2022-08-17T13:16:02.1145079Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46228 2022-08-17T13:16:03.5353077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:03.5353562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:03.5362300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:03.5362790Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:03.5718429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:03.5719118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:03.5729344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:03.5729826Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:03.7000605Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:16:03.7416541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:16:04.0198490Z skip: Need at least 4 CUDA devices (3.450s) 2022-08-17T13:16:04.0198737Z 2022-08-17T13:16:04.0199108Z ---------------------------------------------------------------------- 2022-08-17T13:16:04.0199432Z Ran 1 test in 3.450s 2022-08-17T13:16:04.0199596Z 2022-08-17T13:16:04.0199710Z OK (skipped=1) 2022-08-17T13:16:04.0199868Z 2022-08-17T13:16:04.0199998Z Generating XML reports... 
2022-08-17T13:16:04.0237649Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131600.xml 2022-08-17T13:16:05.8131401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:05.8131927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:05.8134644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:05.8135136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:05.9885666Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:16:05.9901191Z 2022-08-17T13:16:05.9901438Z Running tests... 2022-08-17T13:16:05.9901866Z ---------------------------------------------------------------------- 2022-08-17T13:16:07.4954415Z test_gloo_backend_4gpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:16:07.5140466Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46330 2022-08-17T13:16:07.5147300Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46331 2022-08-17T13:16:08.9405752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:08.9406245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:08.9415167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:08.9415653Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:08.9813297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:08.9813765Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:08.9825203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:08.9825684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:09.1052159Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:16:09.1514152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:16:09.4201231Z skip: Need at least 8 CUDA devices (3.430s) 2022-08-17T13:16:09.4201463Z 2022-08-17T13:16:09.4201843Z ---------------------------------------------------------------------- 2022-08-17T13:16:09.4202186Z Ran 1 test in 3.430s 2022-08-17T13:16:09.4202354Z 2022-08-17T13:16:09.4202471Z OK (skipped=1) 2022-08-17T13:16:09.4202610Z 2022-08-17T13:16:09.4202742Z Generating XML reports... 
2022-08-17T13:16:09.4239408Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131605.xml 2022-08-17T13:16:11.1790144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:11.1790648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:11.1793370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:11.1793859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:11.3551692Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:16:11.3566973Z 2022-08-17T13:16:11.3567115Z Running tests... 2022-08-17T13:16:11.3567536Z ---------------------------------------------------------------------- 2022-08-17T13:16:12.8686834Z test_gloo_backend_cpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:16:12.8882919Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46433 2022-08-17T13:16:12.8889682Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46434 2022-08-17T13:16:14.3319551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:14.3320065Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:14.3329183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:14.3329669Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:14.3701743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:14.3702220Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:14.3713842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:14.3714340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:14.4982489Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:16:14.5400788Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:16:14.5725493Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_zzzo0bq 2022-08-17T13:16:14.5726622Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeksxz3rq 2022-08-17T13:16:14.5728144Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_zzzo0bq/_remote_module_non_scriptable.py 2022-08-17T13:16:14.5730231Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeksxz3rq/_remote_module_non_scriptable.py 2022-08-17T13:16:14.5939821Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:16:14.5940312Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:16:14.9946847Z ok (3.638s) 2022-08-17T13:16:14.9947047Z 2022-08-17T13:16:14.9947442Z ---------------------------------------------------------------------- 2022-08-17T13:16:14.9947797Z Ran 1 test in 3.638s 2022-08-17T13:16:14.9948233Z 2022-08-17T13:16:14.9948339Z OK 2022-08-17T13:16:14.9948481Z 2022-08-17T13:16:14.9948618Z Generating XML reports... 2022-08-17T13:16:14.9984211Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131611.xml 2022-08-17T13:16:16.7542334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:16.7542838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:16.7545259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:16.7545827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:16.9299248Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:16:16.9314669Z 2022-08-17T13:16:16.9314873Z Running tests... 2022-08-17T13:16:16.9315312Z ---------------------------------------------------------------------- 2022-08-17T13:16:18.4293100Z test_gloo_backend_cpu_module_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:16:18.4489086Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46546 2022-08-17T13:16:18.4495511Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46547 2022-08-17T13:16:19.8393839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:19.8394336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:19.8402647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:19.8403161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:19.8748351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:19.8748805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:19.8759020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:19.8759508Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:20.0056003Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:16:20.0418650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:16:20.0742008Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphgitk99d 2022-08-17T13:16:20.0742555Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7f37lw8a 2022-08-17T13:16:20.0744323Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphgitk99d/_remote_module_non_scriptable.py 2022-08-17T13:16:20.0744902Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7f37lw8a/_remote_module_non_scriptable.py 2022-08-17T13:16:20.0952188Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:16:20.0952673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:16:20.4554071Z ok (3.524s) 2022-08-17T13:16:20.4554386Z 2022-08-17T13:16:20.4554886Z ---------------------------------------------------------------------- 2022-08-17T13:16:20.4555238Z Ran 1 test in 3.524s 2022-08-17T13:16:20.4555407Z 2022-08-17T13:16:20.4555482Z OK 2022-08-17T13:16:20.4555622Z 2022-08-17T13:16:20.4555757Z Generating XML reports... 2022-08-17T13:16:20.4592843Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131616.xml 2022-08-17T13:16:22.2111086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:22.2111886Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:22.2114251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:22.2114748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:22.3865369Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:16:22.3880997Z 2022-08-17T13:16:22.3881408Z Running tests... 2022-08-17T13:16:22.3881893Z ---------------------------------------------------------------------- 2022-08-17T13:16:22.3897769Z test_ignored_output (__main__.DistributedDataParallelTest) 2022-08-17T13:16:23.8964861Z Test that the output of a model can be ignored and that there is no ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:16:23.9152539Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46659 2022-08-17T13:16:23.9158808Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46660 2022-08-17T13:16:25.3276302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:25.3276791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:25.3285447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:25.3285932Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:25.3394568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:25.3395041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:25.3406175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:25.3406664Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:25.4928247Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:16:25.5097853Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:16:25.5344566Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbksuwyz2 2022-08-17T13:16:25.5347333Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbksuwyz2/_remote_module_non_scriptable.py 2022-08-17T13:16:25.5347880Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp701wcudf 2022-08-17T13:16:25.5350623Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp701wcudf/_remote_module_non_scriptable.py 2022-08-17T13:16:25.5592312Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:16:25.5593419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:16:25.9214572Z ok (3.533s) 2022-08-17T13:16:25.9214808Z 2022-08-17T13:16:25.9215196Z ---------------------------------------------------------------------- 2022-08-17T13:16:25.9215536Z Ran 1 test in 3.533s 2022-08-17T13:16:25.9215700Z 2022-08-17T13:16:25.9215776Z OK 2022-08-17T13:16:25.9215913Z 2022-08-17T13:16:25.9216053Z Generating XML reports... 2022-08-17T13:16:25.9252322Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131622.xml 2022-08-17T13:16:27.6784526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:27.6785016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:27.6787997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:27.6788510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:27.8535518Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:16:27.8550940Z 2022-08-17T13:16:27.8551318Z Running tests... 2022-08-17T13:16:27.8551792Z ---------------------------------------------------------------------- 2022-08-17T13:16:27.8568260Z test_ignored_output_with_unused_parameters (__main__.DistributedDataParallelTest) 2022-08-17T13:16:29.3628123Z Test that the output of a model can be ignored and that there is no ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:16:29.3884342Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46802 2022-08-17T13:16:29.3889930Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46803 2022-08-17T13:16:30.7834921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:30.7835770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:30.7844009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:30.7844497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:30.8065863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:30.8066337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:30.8077929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:30.8078419Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:30.9484747Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:16:30.9766628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:16:31.0083112Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm9fsfwrm 2022-08-17T13:16:31.0086059Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm9fsfwrm/_remote_module_non_scriptable.py 2022-08-17T13:16:31.0086618Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqy6vatm9 2022-08-17T13:16:31.0088918Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqy6vatm9/_remote_module_non_scriptable.py 2022-08-17T13:16:31.3945698Z ok 
(3.539s) 2022-08-17T13:16:31.3945915Z 2022-08-17T13:16:31.3946309Z ---------------------------------------------------------------------- 2022-08-17T13:16:31.3946663Z Ran 1 test in 3.539s 2022-08-17T13:16:31.3946815Z 2022-08-17T13:16:31.3946918Z OK 2022-08-17T13:16:31.3947058Z 2022-08-17T13:16:31.3947193Z Generating XML reports... 2022-08-17T13:16:31.3982286Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131627.xml 2022-08-17T13:16:33.1654037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:33.1654542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:33.1657165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:33.1657656Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:33.3412309Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:16:33.3427920Z 2022-08-17T13:16:33.3428161Z Running tests... 2022-08-17T13:16:33.3428584Z ---------------------------------------------------------------------- 2022-08-17T13:16:34.8607618Z test_invalid_powerSGD_state (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:16:34.8804257Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46945 2022-08-17T13:16:34.8811134Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46946 2022-08-17T13:16:36.2741500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:36.2742010Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:36.2742610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:36.2743063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:36.2750809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:36.2751279Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:36.2751874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:36.2752491Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:36.4455553Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:16:36.4456041Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:16:36.4461198Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:16:36.4462300Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; 
min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:16:36.4463687Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:16:36.4464759Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:16:36.4465830Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:16:36.4466883Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:16:36.4467936Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency 
= 10000; batch_tensors_with_same_shape = False 2022-08-17T13:16:36.4469183Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:16:36.4470265Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:16:36.4471294Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:16:36.4472433Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:16:36.4473478Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:16:36.7864883Z ok (3.443s) 2022-08-17T13:16:36.7865091Z 2022-08-17T13:16:36.7865468Z 
---------------------------------------------------------------------- 2022-08-17T13:16:36.7865785Z Ran 1 test in 3.444s 2022-08-17T13:16:36.7865971Z 2022-08-17T13:16:36.7867542Z OK 2022-08-17T13:16:36.7868414Z 2022-08-17T13:16:36.7868572Z Generating XML reports... 2022-08-17T13:16:36.7902558Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131633.xml 2022-08-17T13:16:38.5762192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:38.5762722Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:38.5765158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:38.5765642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:38.7527735Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:16:38.7543190Z 2022-08-17T13:16:38.7543433Z Running tests... 2022-08-17T13:16:38.7543863Z ---------------------------------------------------------------------- 2022-08-17T13:16:40.2766899Z test_save_load_checkpoint (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:16:40.2965853Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47048 2022-08-17T13:16:40.2972236Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47049 2022-08-17T13:16:41.6897364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:41.6897860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:41.6906799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:41.6907287Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:41.7180214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:41.7180691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:41.7192212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:41.7192942Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:41.8552055Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:16:41.8883285Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:16:41.9095934Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:16:41.9096719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:16:41.9097432Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:16:41.9098358Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:16:43.2076914Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphmto61c2 2022-08-17T13:16:43.2077544Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphmto61c2/_remote_module_non_scriptable.py 2022-08-17T13:16:43.2366791Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqtck9zou 2022-08-17T13:16:43.2369898Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqtck9zou/_remote_module_non_scriptable.py 2022-08-17T13:16:43.2570162Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:16:43.2570719Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:16:43.2571425Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:16:43.2571989Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:16:43.6680337Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:16:43.6680851Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:16:43.6816292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:16:43.6816781Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:16:44.1071322Z ok (5.352s) 2022-08-17T13:16:44.1071535Z 2022-08-17T13:16:44.1071922Z ---------------------------------------------------------------------- 2022-08-17T13:16:44.1072245Z Ran 1 test in 5.353s 2022-08-17T13:16:44.1072413Z 2022-08-17T13:16:44.1072508Z OK 2022-08-17T13:16:44.1072647Z 2022-08-17T13:16:44.1072783Z Generating XML reports... 2022-08-17T13:16:44.1108558Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131638.xml 2022-08-17T13:16:45.9048065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:45.9048550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:45.9051292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:45.9051776Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:46.0804292Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:16:46.0819371Z 2022-08-17T13:16:46.0819805Z Running tests... 2022-08-17T13:16:46.0820290Z ---------------------------------------------------------------------- 2022-08-17T13:16:47.5976926Z test_sparse_gradients (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:16:47.6173499Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47163 2022-08-17T13:16:47.6180085Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47164 2022-08-17T13:16:49.0437141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:49.0437635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:49.0447334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:49.0447819Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:49.0779695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:49.0780158Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:49.0791022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:49.0791503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:49.2145148Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:16:49.2447452Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:16:49.2769625Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqg2nurq1 2022-08-17T13:16:49.2771608Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy84xycos 2022-08-17T13:16:49.2772156Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqg2nurq1/_remote_module_non_scriptable.py 2022-08-17T13:16:49.2774641Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy84xycos/_remote_module_non_scriptable.py 2022-08-17T13:16:49.6235477Z ok 
(3.541s) 2022-08-17T13:16:49.6235823Z 2022-08-17T13:16:49.6236447Z ---------------------------------------------------------------------- 2022-08-17T13:16:49.6237025Z Ran 1 test in 3.542s 2022-08-17T13:16:49.6237304Z 2022-08-17T13:16:49.6237449Z OK 2022-08-17T13:16:49.6237698Z 2022-08-17T13:16:49.6237911Z Generating XML reports... 2022-08-17T13:16:49.6275904Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131646.xml 2022-08-17T13:16:51.3911818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:51.3912328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:51.3914633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:51.3915128Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:51.5679472Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:16:51.5694149Z 2022-08-17T13:16:51.5694555Z Running tests... 2022-08-17T13:16:51.5695055Z ---------------------------------------------------------------------- 2022-08-17T13:16:53.0918298Z test_sparse_gradients_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:16:53.1107581Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47306 2022-08-17T13:16:53.1113782Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47307 2022-08-17T13:16:54.5084238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:54.5084756Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:54.5093214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:54.5093716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:54.5358816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:54.5359612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:54.5370895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:54.5371373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:54.6743055Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:16:54.7064245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:16:54.7285246Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpim8eb9mg 2022-08-17T13:16:54.7287790Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpim8eb9mg/_remote_module_non_scriptable.py 2022-08-17T13:16:54.7288514Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpphguxs1w 2022-08-17T13:16:54.7291192Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpphguxs1w/_remote_module_non_scriptable.py 2022-08-17T13:16:55.1170150Z ok 
(3.547s) 2022-08-17T13:16:55.1170332Z 2022-08-17T13:16:55.1170691Z ---------------------------------------------------------------------- 2022-08-17T13:16:55.1171030Z Ran 1 test in 3.548s 2022-08-17T13:16:55.1171199Z 2022-08-17T13:16:55.1171296Z OK 2022-08-17T13:16:55.1171438Z 2022-08-17T13:16:55.1171572Z Generating XML reports... 2022-08-17T13:16:55.1206550Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131651.xml 2022-08-17T13:16:56.8834758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:16:56.8835275Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:16:56.8837925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:16:56.8838418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:16:57.0613159Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:16:57.0628235Z 2022-08-17T13:16:57.0628542Z Running tests... 2022-08-17T13:16:57.0628995Z ---------------------------------------------------------------------- 2022-08-17T13:16:58.5688399Z test_sync_batch_norm_empty_input (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:16:58.5877683Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47449 2022-08-17T13:16:58.5884501Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47450 2022-08-17T13:17:00.0114163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:00.0115188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:00.0125089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:00.0126053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:00.0286040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:00.0286992Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:00.0297930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:00.0298893Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:00.1790963Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:17:00.2022510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:17:01.5152967Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpekpqfhmv 2022-08-17T13:17:01.5154412Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpekpqfhmv/_remote_module_non_scriptable.py 2022-08-17T13:17:01.5402985Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv0lex7cn 2022-08-17T13:17:01.5405482Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv0lex7cn/_remote_module_non_scriptable.py 2022-08-17T13:17:01.5595968Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:17:01.5597114Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:17:01.5598401Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:17:01.5599693Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:17:02.7810531Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:17:02.7811542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:17:03.4004178Z ok (6.337s) 2022-08-17T13:17:03.4004423Z 2022-08-17T13:17:03.4004829Z ---------------------------------------------------------------------- 2022-08-17T13:17:03.4005168Z Ran 1 test in 6.337s 2022-08-17T13:17:03.4005336Z 2022-08-17T13:17:03.4005428Z OK 2022-08-17T13:17:03.4005564Z 2022-08-17T13:17:03.4005681Z Generating XML reports... 
2022-08-17T13:17:03.4040530Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131657.xml 2022-08-17T13:17:05.1486742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:05.1487261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:05.1489879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:05.1490377Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:05.3242397Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:17:05.3257614Z 2022-08-17T13:17:05.3257885Z Running tests... 2022-08-17T13:17:05.3258328Z ---------------------------------------------------------------------- 2022-08-17T13:17:06.8325295Z test_sync_batch_norm_only_empty_input (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:17:06.8520805Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47564 2022-08-17T13:17:06.8528031Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47565 2022-08-17T13:17:08.2786496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:08.2787016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:08.2796725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:08.2797274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:08.3247426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:08.3248030Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:08.3258986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:08.3259750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:08.4450625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:17:08.4950913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:17:09.7985717Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2d3g38qk 2022-08-17T13:17:09.7986719Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2d3g38qk/_remote_module_non_scriptable.py 2022-08-17T13:17:09.8138796Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp22z02nco 2022-08-17T13:17:09.8141389Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp22z02nco/_remote_module_non_scriptable.py 2022-08-17T13:17:09.8327910Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:17:09.8328708Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:17:09.8329673Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:17:09.8330275Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:17:10.4458788Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:17:10.9631361Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:17:10.9632194Z ok (5.637s) 2022-08-17T13:17:10.9632380Z 2022-08-17T13:17:10.9632756Z ---------------------------------------------------------------------- 2022-08-17T13:17:10.9633099Z Ran 1 test in 5.637s 2022-08-17T13:17:10.9633265Z 2022-08-17T13:17:10.9633361Z OK 2022-08-17T13:17:10.9633498Z 2022-08-17T13:17:10.9633626Z Generating XML reports... 
2022-08-17T13:17:10.9667450Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131705.xml 2022-08-17T13:17:12.7362757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:12.7363741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:12.7365628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:12.7366547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:12.9130956Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:17:12.9147851Z 2022-08-17T13:17:12.9148382Z Running tests... 2022-08-17T13:17:12.9148908Z ---------------------------------------------------------------------- 2022-08-17T13:17:14.4312313Z test_allgather_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:17:14.4509972Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47679 2022-08-17T13:17:14.4517139Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47680 2022-08-17T13:17:14.4523881Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47681 2022-08-17T13:17:14.4530351Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47682 2022-08-17T13:17:15.8667571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:15.8668098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:15.8677017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:15.8677509Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:15.9333442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:15.9333965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:15.9344230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:15.9344988Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:15.9388280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:15.9388740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:15.9399401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:15.9399872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:15.9675282Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:15.9675739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:15.9686414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:15.9686897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:16.0325246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:17:16.1021346Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:17:16.1052834Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:17:16.1350233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:17:16.5595581Z ok (3.644s) 2022-08-17T13:17:16.5595779Z 2022-08-17T13:17:16.5596176Z ---------------------------------------------------------------------- 2022-08-17T13:17:16.5596523Z Ran 1 test in 3.645s 2022-08-17T13:17:16.5596690Z 2022-08-17T13:17:16.5596812Z OK 2022-08-17T13:17:16.5596950Z 2022-08-17T13:17:16.5597072Z Generating XML reports... 
2022-08-17T13:17:16.5632954Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131712.xml 2022-08-17T13:17:18.2902603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:18.2903101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:18.2905620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:18.2906111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:18.4623629Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:17:18.4639225Z 2022-08-17T13:17:18.4639370Z Running tests... 2022-08-17T13:17:18.4640074Z ---------------------------------------------------------------------- 2022-08-17T13:17:19.9368821Z test_allgather_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:17:19.9557400Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47862 2022-08-17T13:17:19.9563061Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47863 2022-08-17T13:17:19.9569494Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47864 2022-08-17T13:17:19.9575494Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47865 2022-08-17T13:17:21.4276135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:21.4277120Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:21.4285636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:21.4286580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:21.4542800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:21.4544383Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:21.4555694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:21.4556683Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:21.4828653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:21.4829635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:21.4839653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:21.4840635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:21.5157959Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:21.5158919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:21.5169513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:21.5170515Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:21.5956837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:17:21.6230334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:17:21.6492536Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:17:21.6858472Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:17:23.8693682Z ok (5.405s) 2022-08-17T13:17:23.8693928Z 2022-08-17T13:17:23.8694309Z ---------------------------------------------------------------------- 2022-08-17T13:17:23.8694657Z Ran 1 test in 5.405s 2022-08-17T13:17:23.8694825Z 2022-08-17T13:17:23.8694902Z OK 2022-08-17T13:17:23.8695039Z 2022-08-17T13:17:23.8695185Z Generating XML reports... 
2022-08-17T13:17:23.8734573Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131718.xml 2022-08-17T13:17:25.6577332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:25.6577827Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:25.6580864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:25.6581347Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:25.8338262Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:17:25.8353947Z 2022-08-17T13:17:25.8354193Z Running tests... 2022-08-17T13:17:25.8354616Z ---------------------------------------------------------------------- 2022-08-17T13:17:27.3379983Z test_allgather_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:17:27.3576723Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48049 2022-08-17T13:17:27.3582738Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48050 2022-08-17T13:17:27.3589837Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48051 2022-08-17T13:17:27.3595838Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48052 2022-08-17T13:17:28.7858136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:28.7858693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:28.7868133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:28.7868628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:28.7998138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:28.7998641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:28.8009248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:28.8010075Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:28.8010669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:28.8011123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:28.8022348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:28.8022995Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:28.8146356Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:28.8146831Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:28.8157380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:28.8157861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:28.9558553Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:17:28.9703283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:17:28.9719753Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:17:28.9792287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:17:29.4660629Z ok (3.630s) 2022-08-17T13:17:29.4660853Z 2022-08-17T13:17:29.4661265Z ---------------------------------------------------------------------- 2022-08-17T13:17:29.4661611Z Ran 1 test in 3.631s 2022-08-17T13:17:29.4661758Z 2022-08-17T13:17:29.4661851Z OK 2022-08-17T13:17:29.4662999Z 2022-08-17T13:17:29.4663309Z Generating XML reports... 
2022-08-17T13:17:29.4699197Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131725.xml 2022-08-17T13:17:31.2647566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:31.2650068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:31.2650676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:31.2651184Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:31.4407349Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:17:31.4421769Z 2022-08-17T13:17:31.4422245Z Running tests... 2022-08-17T13:17:31.4422729Z ---------------------------------------------------------------------- 2022-08-17T13:17:32.9674958Z test_allgather_coalesced_async (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:17:32.9872133Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48232 2022-08-17T13:17:32.9878314Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48233 2022-08-17T13:17:32.9884816Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48234 2022-08-17T13:17:32.9891141Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48235 2022-08-17T13:17:34.4002556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:34.4003503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:34.4012056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:34.4013028Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:34.4132676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:34.4133631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:34.4137933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:34.4138873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:34.4144059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:34.4145271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:34.4152308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:34.4153296Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:34.4376137Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:34.4377096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:34.4388030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:34.4389012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:34.5659543Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:17:34.5852082Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:17:34.5862203Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:17:34.6079591Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:17:34.6279456Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:17:34.6381253Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:17:34.6485000Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-08-17T13:17:34.6485567Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-08-17T13:17:34.6486730Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:17:34.6487490Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:17:34.6586925Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-08-17T13:17:34.6588283Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:17:35.0954725Z ok (3.653s) 2022-08-17T13:17:35.0954946Z 2022-08-17T13:17:35.0955341Z ---------------------------------------------------------------------- 2022-08-17T13:17:35.0955682Z Ran 1 test in 3.653s 2022-08-17T13:17:35.0955849Z 2022-08-17T13:17:35.0955926Z OK 2022-08-17T13:17:35.0956063Z 2022-08-17T13:17:35.0956201Z Generating XML reports... 2022-08-17T13:17:35.0991458Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131731.xml 2022-08-17T13:17:36.8527759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:36.8528293Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:36.8530548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:36.8531060Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:37.0289726Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:17:37.0304994Z 2022-08-17T13:17:37.0305465Z Running tests... 2022-08-17T13:17:37.0305963Z ---------------------------------------------------------------------- 2022-08-17T13:17:38.5464226Z test_allgather_coalesced_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:17:38.5659755Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48415 2022-08-17T13:17:38.5666407Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48416 2022-08-17T13:17:38.5672686Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48417 2022-08-17T13:17:38.5678969Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48418 2022-08-17T13:17:40.0033961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:40.0034990Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:40.0045126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:40.0046115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:40.0123320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:40.0124149Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:40.0134755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:40.0135593Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:40.0211748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:40.0212662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:40.0223546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:40.0224788Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:40.0228721Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:40.0229613Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:40.0241412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:40.0242401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:40.1734061Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:17:40.1794851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:17:40.1911834Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:17:40.1987491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:17:40.6742809Z ok (3.643s) 2022-08-17T13:17:40.6742991Z 2022-08-17T13:17:40.6743389Z ---------------------------------------------------------------------- 2022-08-17T13:17:40.6743729Z Ran 1 test in 3.644s 2022-08-17T13:17:40.6744136Z 2022-08-17T13:17:40.6744238Z OK 2022-08-17T13:17:40.6744379Z 2022-08-17T13:17:40.6744512Z Generating XML reports... 
2022-08-17T13:17:40.6780490Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131737.xml 2022-08-17T13:17:42.4840784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:42.4841333Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:42.4843658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:42.4844131Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:42.6607280Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:17:42.6622475Z 2022-08-17T13:17:42.6622920Z Running tests... 2022-08-17T13:17:42.6623554Z ---------------------------------------------------------------------- 2022-08-17T13:17:44.1812202Z test_allgather_noncontiguous_input (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:17:44.2008423Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48598 2022-08-17T13:17:44.2014766Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48599 2022-08-17T13:17:44.2021153Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48600 2022-08-17T13:17:44.2028368Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48601 2022-08-17T13:17:45.6155345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:45.6155861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:45.6165146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:45.6165641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:45.6460663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:45.6461193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:45.6471969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:45.6472454Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:45.6698176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:45.6698640Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:45.6709778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:45.6710259Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:45.7011548Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:45.7012033Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:45.7023285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:45.7023770Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:45.7820307Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:17:45.8127958Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:17:45.8358423Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:17:45.8699669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:17:46.3092539Z ok (3.647s) 2022-08-17T13:17:46.3092737Z 2022-08-17T13:17:46.3093128Z ---------------------------------------------------------------------- 2022-08-17T13:17:46.3093490Z Ran 1 test in 3.647s 2022-08-17T13:17:46.3093654Z 2022-08-17T13:17:46.3093747Z OK 2022-08-17T13:17:46.3093887Z 2022-08-17T13:17:46.3094007Z Generating XML reports... 
2022-08-17T13:17:46.3131146Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131742.xml 2022-08-17T13:17:48.1054577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:48.1057089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:48.1057706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:48.1058382Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:48.2801875Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:17:48.2816651Z 2022-08-17T13:17:48.2816999Z Running tests... 2022-08-17T13:17:48.2817444Z ---------------------------------------------------------------------- 2022-08-17T13:17:49.7888813Z test_allgather_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:17:49.8084328Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48781 2022-08-17T13:17:49.8090525Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48782 2022-08-17T13:17:49.8096628Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48783 2022-08-17T13:17:49.8103078Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48784 2022-08-17T13:17:51.2214275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:51.2214775Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:51.2223237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:51.2223736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:51.2238059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:51.2238503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:51.2249433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:51.2249913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:51.2544240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:51.2544702Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:51.2555417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:51.2555901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:51.3013292Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:51.3013757Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:51.3024692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:51.3025169Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:51.3918285Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:17:51.3939214Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:17:51.4193842Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:17:51.4725234Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:17:52.5182694Z ok (4.236s) 2022-08-17T13:17:52.5182912Z 2022-08-17T13:17:52.5183516Z ---------------------------------------------------------------------- 2022-08-17T13:17:52.5184165Z Ran 1 test in 4.236s 2022-08-17T13:17:52.5184339Z 2022-08-17T13:17:52.5184447Z OK 2022-08-17T13:17:52.5184583Z 2022-08-17T13:17:52.5184719Z Generating XML reports... 
2022-08-17T13:17:52.5218661Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131748.xml 2022-08-17T13:17:54.3120023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:54.3122814Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:54.3123438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:54.3123917Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:54.4875449Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:17:54.4889552Z 2022-08-17T13:17:54.4889782Z Running tests... 2022-08-17T13:17:54.4890228Z ---------------------------------------------------------------------- 2022-08-17T13:17:55.9971353Z test_allgather_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:17:56.0171036Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48988 2022-08-17T13:17:56.0176716Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48989 2022-08-17T13:17:56.0183079Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48990 2022-08-17T13:17:56.0189832Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48991 2022-08-17T13:17:57.4311442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:57.4311958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:57.4320647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:57.4321143Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:57.4338210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:57.4338662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:57.4349910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:57.4350389Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:57.4532460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:57.4532910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:57.4543761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:57.4544244Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:57.4965277Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:17:57.4965982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:17:57.4977016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:17:57.4977488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:17:57.6005953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:17:57.6069567Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:17:57.6197004Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:17:57.6702015Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:18:01.5337060Z ok (7.044s) 2022-08-17T13:18:01.5337460Z 2022-08-17T13:18:01.5338184Z ---------------------------------------------------------------------- 2022-08-17T13:18:01.5338671Z Ran 1 test in 7.045s 2022-08-17T13:18:01.5338843Z 2022-08-17T13:18:01.5338938Z OK 2022-08-17T13:18:01.5339072Z 2022-08-17T13:18:01.5341667Z Generating XML reports... 
2022-08-17T13:18:01.5374356Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131754.xml 2022-08-17T13:18:03.2892156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:03.2892679Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:03.2894964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:03.2895455Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:03.4579414Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:18:03.4594384Z 2022-08-17T13:18:03.4594618Z Running tests... 2022-08-17T13:18:03.4595059Z ---------------------------------------------------------------------- 2022-08-17T13:18:04.9199888Z test_allreduce_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:18:04.9388555Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49199 2022-08-17T13:18:04.9394622Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49200 2022-08-17T13:18:04.9400734Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49201 2022-08-17T13:18:04.9406807Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49202 2022-08-17T13:18:06.4376008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:06.4385386Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:06.4386003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:06.4386470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:06.4539877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:06.4540345Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:06.4551243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:06.4551711Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:06.4844666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:06.4845138Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:06.4856518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:06.4856985Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:06.5140875Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:06.5141339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:06.5152133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:06.5152600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:06.6049587Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:18:06.6212724Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:18:06.6545241Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:18:06.6805220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:18:07.1475023Z ok (3.688s) 2022-08-17T13:18:07.1475208Z 2022-08-17T13:18:07.1475592Z ---------------------------------------------------------------------- 2022-08-17T13:18:07.1475912Z Ran 1 test in 3.688s 2022-08-17T13:18:07.1476084Z 2022-08-17T13:18:07.1476181Z OK 2022-08-17T13:18:07.1476325Z 2022-08-17T13:18:07.1476457Z Generating XML reports... 
2022-08-17T13:18:07.1513237Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131803.xml 2022-08-17T13:18:08.9416167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:08.9416645Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:08.9419614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:08.9420083Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:09.1176132Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:18:09.1192119Z 2022-08-17T13:18:09.1192545Z Running tests... 2022-08-17T13:18:09.1193030Z ---------------------------------------------------------------------- 2022-08-17T13:18:10.6296081Z test_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:18:10.6492227Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49382 2022-08-17T13:18:10.6498570Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49383 2022-08-17T13:18:10.6505000Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49384 2022-08-17T13:18:10.6512040Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49385 2022-08-17T13:18:12.0888976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:12.0889484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:12.0893973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:12.0894434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:12.0898130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:12.0898622Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:12.0905150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:12.0905639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:12.1057163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:12.1057636Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:12.1068401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:12.1068887Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:12.1263554Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:12.1264013Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:12.1274695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:12.1275193Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:12.2609520Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:18:12.2612964Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:18:12.2722966Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:18:12.2901314Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:18:14.3617671Z ok (5.242s) 2022-08-17T13:18:14.3617895Z 2022-08-17T13:18:14.3618291Z ---------------------------------------------------------------------- 2022-08-17T13:18:14.3618632Z Ran 1 test in 5.242s 2022-08-17T13:18:14.3618800Z 2022-08-17T13:18:14.3618877Z OK 2022-08-17T13:18:14.3619020Z 2022-08-17T13:18:14.3619525Z Generating XML reports... 
2022-08-17T13:18:14.3657205Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131809.xml 2022-08-17T13:18:16.1522272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:16.1523233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:16.1525645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:16.1526589Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:16.3283394Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:18:16.3299977Z 2022-08-17T13:18:16.3300495Z Running tests... 2022-08-17T13:18:16.3301064Z ---------------------------------------------------------------------- 2022-08-17T13:18:17.8444469Z test_allreduce_basics_cuda_using_work_api (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:18:17.8642896Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49569 2022-08-17T13:18:17.8649430Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49570 2022-08-17T13:18:17.8656752Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49571 2022-08-17T13:18:17.8663528Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49572 2022-08-17T13:18:19.2713126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:19.2714093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:19.2724124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:19.2725057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:19.2773689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:19.2774568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:19.2785923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:19.2786774Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:19.3002500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:19.3003430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:19.3013514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:19.3014471Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:19.3531682Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:19.3532636Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:19.3543121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:19.3544394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:19.4389362Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:18:19.4469712Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:18:19.4669438Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:18:19.5248341Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:18:21.5766034Z ok (5.246s) 2022-08-17T13:18:21.5766243Z 2022-08-17T13:18:21.5766654Z ---------------------------------------------------------------------- 2022-08-17T13:18:21.5767361Z Ran 1 test in 5.247s 2022-08-17T13:18:21.5767518Z 2022-08-17T13:18:21.5767613Z OK 2022-08-17T13:18:21.5767753Z 2022-08-17T13:18:21.5767896Z Generating XML reports... 
2022-08-17T13:18:21.5803054Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131816.xml 2022-08-17T13:18:23.3465073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:23.3478756Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:23.3479507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:23.3480025Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:23.5218061Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:18:23.5233322Z 2022-08-17T13:18:23.5233527Z Running tests... 2022-08-17T13:18:23.5233970Z ---------------------------------------------------------------------- 2022-08-17T13:18:25.0238620Z test_allreduce_basics_using_work_api (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:18:25.0434060Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49756 2022-08-17T13:18:25.0439948Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49757 2022-08-17T13:18:25.0446170Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49758 2022-08-17T13:18:25.0452864Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49759 2022-08-17T13:18:26.4637607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:26.4638142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:26.4647368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:26.4647877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:26.4668522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:26.4669057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:26.4680920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:26.4681404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:26.5062346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:26.5062840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:26.5073573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:26.5074066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:26.5379112Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:26.5379573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:26.5390919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:26.5391399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:26.6308819Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:18:26.6382562Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:18:26.6718637Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:18:26.7086267Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:18:27.1515588Z ok (3.628s) 2022-08-17T13:18:27.1515773Z 2022-08-17T13:18:27.1516463Z ---------------------------------------------------------------------- 2022-08-17T13:18:27.1516842Z Ran 1 test in 3.628s 2022-08-17T13:18:27.1517012Z 2022-08-17T13:18:27.1517108Z OK 2022-08-17T13:18:27.1517251Z 2022-08-17T13:18:27.1517388Z Generating XML reports... 
2022-08-17T13:18:27.1552524Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131823.xml 2022-08-17T13:18:28.9091497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:28.9092035Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:28.9095004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:28.9095509Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:29.0842795Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:18:29.0857992Z 2022-08-17T13:18:29.0858387Z Running tests... 2022-08-17T13:18:29.0858818Z ---------------------------------------------------------------------- 2022-08-17T13:18:30.5948469Z test_allreduce_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:18:30.6138004Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49939 2022-08-17T13:18:30.6144639Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49940 2022-08-17T13:18:30.6150959Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49941 2022-08-17T13:18:30.6157250Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49942 2022-08-17T13:18:32.0295014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:32.0295518Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:32.0305147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:32.0305637Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:32.0313072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:32.0313538Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:32.0324446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:32.0324928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:32.0616663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:32.0617134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:32.0628091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:32.0628818Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:32.1116646Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:32.1117112Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:32.1127558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:32.1128035Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:32.2044577Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:18:32.2054690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:18:32.2263348Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:18:32.2778074Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:18:32.7220998Z ok (3.636s) 2022-08-17T13:18:32.7221328Z 2022-08-17T13:18:32.7221725Z ---------------------------------------------------------------------- 2022-08-17T13:18:32.7222052Z Ran 1 test in 3.636s 2022-08-17T13:18:32.7222219Z 2022-08-17T13:18:32.7222312Z OK 2022-08-17T13:18:32.7222456Z 2022-08-17T13:18:32.7222591Z Generating XML reports... 
2022-08-17T13:18:32.7257126Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131829.xml 2022-08-17T13:18:34.5029375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:34.5029876Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:34.5032608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:34.5033199Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:34.6807349Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:18:34.6822744Z 2022-08-17T13:18:34.6822992Z Running tests... 2022-08-17T13:18:34.6823418Z ---------------------------------------------------------------------- 2022-08-17T13:18:36.1942862Z test_allreduce_coalesced_async (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:18:36.2140293Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50122 2022-08-17T13:18:36.2146677Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50123 2022-08-17T13:18:36.2152783Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50124 2022-08-17T13:18:36.2159171Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50125 2022-08-17T13:18:37.6227189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:37.6227705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:37.6237099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:37.6237571Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:37.6551656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:37.6552127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:37.6562696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:37.6563213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:37.6578822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:37.6579591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:37.6590618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:37.6591081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:37.6850638Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:37.6851101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:37.6861824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:37.6862305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:37.7910309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:18:37.8240097Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:18:37.8253900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:18:37.8503122Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:18:37.8832822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:18:37.8833343Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:18:37.8833836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-08-17T13:18:37.8834330Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-08-17T13:18:37.8835098Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:18:37.8835944Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:18:37.8837205Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-08-17T13:18:37.8838510Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:18:38.3224089Z ok (3.640s) 2022-08-17T13:18:38.3224299Z 2022-08-17T13:18:38.3224702Z ---------------------------------------------------------------------- 2022-08-17T13:18:38.3225044Z Ran 1 test in 3.640s 2022-08-17T13:18:38.3225211Z 2022-08-17T13:18:38.3225287Z OK 2022-08-17T13:18:38.3225425Z 2022-08-17T13:18:38.3225562Z Generating XML reports... 2022-08-17T13:18:38.3261926Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131834.xml 2022-08-17T13:18:40.0541252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:40.0541821Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:40.0543913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:40.0544407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:40.2246799Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:18:40.2260792Z 2022-08-17T13:18:40.2261185Z Running tests... 2022-08-17T13:18:40.2261645Z ---------------------------------------------------------------------- 2022-08-17T13:18:41.6911557Z test_allreduce_coalesced_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:18:41.7097783Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50305 2022-08-17T13:18:41.7103346Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50306 2022-08-17T13:18:41.7110028Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50307 2022-08-17T13:18:41.7115647Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50308 2022-08-17T13:18:43.1494543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:43.1495095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:43.1505650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:43.1506127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:43.1741852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:43.1742678Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:43.1753073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:43.1753543Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:43.1810331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:43.1810932Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:43.1821791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:43.1822458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:43.1899544Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:43.1900013Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:43.1911325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:43.1911995Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:43.3241581Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:18:43.3434080Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:18:43.3510037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:18:43.3566830Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:18:43.8178731Z ok (3.591s) 2022-08-17T13:18:43.8178969Z 2022-08-17T13:18:43.8179356Z ---------------------------------------------------------------------- 2022-08-17T13:18:43.8179743Z Ran 1 test in 3.592s 2022-08-17T13:18:43.8179913Z 2022-08-17T13:18:43.8180009Z OK 2022-08-17T13:18:43.8180129Z 2022-08-17T13:18:43.8180268Z Generating XML reports... 
2022-08-17T13:18:43.8215853Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131840.xml 2022-08-17T13:18:45.5699107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:45.5699637Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:45.5702185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:45.5702674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:45.7455448Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:18:45.7471075Z 2022-08-17T13:18:45.7471223Z Running tests... 2022-08-17T13:18:45.7472130Z ---------------------------------------------------------------------- 2022-08-17T13:18:47.2554225Z test_allreduce_coalesced_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:18:47.2741473Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50488 2022-08-17T13:18:47.2748950Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50489 2022-08-17T13:18:47.2755477Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50490 2022-08-17T13:18:47.2762420Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50491 2022-08-17T13:18:48.6843189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:48.6843708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:48.6853130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:48.6854070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:48.7070256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:48.7070947Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:48.7081408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:48.7082133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:48.7500393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:48.7501044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:48.7511666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:48.7512424Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:48.7647662Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:48.7648370Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:48.7659287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:48.7659966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:48.8533401Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:18:48.8731758Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:18:48.9162565Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:18:48.9361529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:18:49.3825577Z ok (3.635s) 2022-08-17T13:18:49.3825799Z 2022-08-17T13:18:49.3826208Z ---------------------------------------------------------------------- 2022-08-17T13:18:49.3826534Z Ran 1 test in 3.635s 2022-08-17T13:18:49.3826704Z 2022-08-17T13:18:49.3826822Z OK 2022-08-17T13:18:49.3830193Z 2022-08-17T13:18:49.3830552Z Generating XML reports... 
2022-08-17T13:18:49.3862817Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131845.xml 2022-08-17T13:18:51.1279934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:51.1280431Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:51.1282626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:51.1283117Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:51.2969385Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:18:51.2983505Z 2022-08-17T13:18:51.2983973Z Running tests... 2022-08-17T13:18:51.2984744Z ---------------------------------------------------------------------- 2022-08-17T13:18:52.7776048Z test_allreduce_coalesced_checks_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:18:52.7965914Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50671 2022-08-17T13:18:52.7972070Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50672 2022-08-17T13:18:52.7978052Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50673 2022-08-17T13:18:52.7984310Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50674 2022-08-17T13:18:54.2106061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:54.2106939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:54.2115825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:54.2116315Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:54.2180665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:54.2181133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:54.2181701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:54.2182134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:54.2192729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:54.2193211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:54.2193813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:54.2194270Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:54.2370286Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:54.2370744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:54.2382336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:54.2382796Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:54.3776574Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:18:54.3899446Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:18:54.3915486Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:18:54.4081222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:18:56.5084900Z ok (5.210s) 2022-08-17T13:18:56.5085265Z 2022-08-17T13:18:56.5085856Z ---------------------------------------------------------------------- 2022-08-17T13:18:56.5086204Z Ran 1 test in 5.210s 2022-08-17T13:18:56.5086368Z 2022-08-17T13:18:56.5086461Z OK 2022-08-17T13:18:56.5086577Z 2022-08-17T13:18:56.5086711Z Generating XML reports... 
2022-08-17T13:18:56.5122180Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131851.xml 2022-08-17T13:18:58.2842055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:18:58.2842562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:18:58.2844560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:18:58.2845071Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:18:58.4587808Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:18:58.4602063Z 2022-08-17T13:18:58.4602456Z Running tests... 2022-08-17T13:18:58.4602921Z ---------------------------------------------------------------------- 2022-08-17T13:18:59.9691674Z test_allreduce_coalesced_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:18:59.9886508Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50858 2022-08-17T13:18:59.9892694Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50859 2022-08-17T13:18:59.9898873Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50860 2022-08-17T13:18:59.9905427Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50861 2022-08-17T13:19:01.3957107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:01.3957606Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:01.3966948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:01.3967436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:01.4052130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:01.4052577Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:01.4063507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:01.4064019Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:01.4331402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:01.4331868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:01.4337010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:01.4337462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:01.4342404Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:01.4342884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:01.4348453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:01.4348930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:01.5620580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:19:01.5717459Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:19:01.6057449Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:19:01.6057934Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:19:02.3977573Z ok (3.937s) 2022-08-17T13:19:02.3977780Z 2022-08-17T13:19:02.3978158Z ---------------------------------------------------------------------- 2022-08-17T13:19:02.3978517Z Ran 1 test in 3.937s 2022-08-17T13:19:02.3978683Z 2022-08-17T13:19:02.3978775Z OK 2022-08-17T13:19:02.3978909Z 2022-08-17T13:19:02.3979045Z Generating XML reports... 
2022-08-17T13:19:02.4015365Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131858.xml 2022-08-17T13:19:04.1761135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:04.1761657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:04.1764501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:04.1765012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:04.3496583Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:19:04.3512139Z 2022-08-17T13:19:04.3512550Z Running tests... 2022-08-17T13:19:04.3513054Z ---------------------------------------------------------------------- 2022-08-17T13:19:05.8545717Z test_allreduce_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:19:05.8737466Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51065 2022-08-17T13:19:05.8743568Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51066 2022-08-17T13:19:05.8750662Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51067 2022-08-17T13:19:05.8757133Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51068 2022-08-17T13:19:07.2921627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:07.2922140Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:07.2930364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:07.2930811Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:07.2931399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:07.2931872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:07.2934980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:07.2935433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:07.2941764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:07.2942248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:07.2946373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:07.2946828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:07.3115736Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:07.3116201Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:07.3127146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:07.3127613Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:07.4701195Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:19:07.4701713Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:19:07.4721334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:19:07.4828650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:19:08.1826111Z ok (3.831s) 2022-08-17T13:19:08.1826328Z 2022-08-17T13:19:08.1826719Z ---------------------------------------------------------------------- 2022-08-17T13:19:08.1827174Z Ran 1 test in 3.831s 2022-08-17T13:19:08.1827457Z 2022-08-17T13:19:08.1827572Z OK 2022-08-17T13:19:08.1827719Z 2022-08-17T13:19:08.1827862Z Generating XML reports... 
2022-08-17T13:19:08.1862791Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131904.xml 2022-08-17T13:19:09.9251507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:09.9252044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:09.9253572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:09.9254050Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:10.0950517Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:19:10.0964782Z 2022-08-17T13:19:10.0965020Z Running tests... 2022-08-17T13:19:10.0965516Z ---------------------------------------------------------------------- 2022-08-17T13:19:11.5589296Z test_allreduce_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:19:11.5777200Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51272 2022-08-17T13:19:11.5783219Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51273 2022-08-17T13:19:11.5789945Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51274 2022-08-17T13:19:11.5796825Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51275 2022-08-17T13:19:12.9829478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:12.9829975Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:12.9839327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:12.9839822Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:12.9965983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:12.9966454Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:12.9977303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:12.9977803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:13.0540186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:13.0540647Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:13.0551173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:13.0551666Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:13.1088336Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:13.1088859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:13.1098978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:13.1099724Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:13.1485574Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:19:13.1634313Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:19:13.2212820Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:19:13.2852345Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:19:15.6909355Z ok (5.594s) 2022-08-17T13:19:15.6909743Z 2022-08-17T13:19:15.6910385Z ---------------------------------------------------------------------- 2022-08-17T13:19:15.6910985Z Ran 1 test in 5.594s 2022-08-17T13:19:15.6911289Z 2022-08-17T13:19:15.6911449Z OK 2022-08-17T13:19:15.6911695Z 2022-08-17T13:19:15.6911935Z Generating XML reports... 
2022-08-17T13:19:15.6949368Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131910.xml 2022-08-17T13:19:17.4624110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:17.4624615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:17.4627113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:17.4627618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:17.6377456Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:19:17.6393192Z 2022-08-17T13:19:17.6393530Z Running tests... 2022-08-17T13:19:17.6394222Z ---------------------------------------------------------------------- 2022-08-17T13:19:19.1317881Z test_barrier_implies_wait (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:19:19.1506049Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51483 2022-08-17T13:19:19.1512838Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51484 2022-08-17T13:19:19.1518845Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51485 2022-08-17T13:19:19.1525353Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51486 2022-08-17T13:19:20.5566151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:20.5566642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:20.5575592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:20.5576096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:20.5657663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:20.5658104Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:20.5669233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:20.5669709Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:20.5684618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:20.5685054Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:20.5696105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:20.5696591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:20.5849093Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:20.5849545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:20.5860958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:20.5861437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:20.7228856Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:19:20.7341133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:19:20.7365352Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:19:20.7573607Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:19:21.2590383Z ok (3.619s) 2022-08-17T13:19:21.2590812Z 2022-08-17T13:19:21.2591443Z ---------------------------------------------------------------------- 2022-08-17T13:19:21.2592034Z Ran 1 test in 3.620s 2022-08-17T13:19:21.2592739Z 2022-08-17T13:19:21.2592931Z OK 2022-08-17T13:19:21.2593164Z 2022-08-17T13:19:21.2593403Z Generating XML reports... 
2022-08-17T13:19:21.2630304Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131917.xml 2022-08-17T13:19:23.0477530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:23.0478042Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:23.0480417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:23.0480893Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:23.2234227Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:19:23.2248901Z 2022-08-17T13:19:23.2249047Z Running tests... 2022-08-17T13:19:23.2249718Z ---------------------------------------------------------------------- 2022-08-17T13:19:24.7343428Z test_broadcast_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:19:24.7544009Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51666 2022-08-17T13:19:24.7549789Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51667 2022-08-17T13:19:24.7555970Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51668 2022-08-17T13:19:24.7562277Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51669 2022-08-17T13:19:26.1739469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:26.1740462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:26.1749706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:26.1750689Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:26.1751904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:26.1752790Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:26.1760084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:26.1761005Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:26.1764790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:26.1765752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:26.1771706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:26.1772706Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:26.1956355Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:26.1957196Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:26.1968269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:26.1969255Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:26.3486490Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:19:26.3521992Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:19:26.3529967Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:19:26.3684403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:19:26.7624756Z ok (3.537s) 2022-08-17T13:19:26.7625042Z 2022-08-17T13:19:26.7625457Z ---------------------------------------------------------------------- 2022-08-17T13:19:26.7625781Z Ran 1 test in 3.537s 2022-08-17T13:19:26.7625948Z 2022-08-17T13:19:26.7626045Z OK 2022-08-17T13:19:26.7626182Z 2022-08-17T13:19:26.7626318Z Generating XML reports... 
2022-08-17T13:19:26.7660579Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131923.xml 2022-08-17T13:19:28.5347573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:28.5348074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:28.5350789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:28.5351620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:28.7137258Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:19:28.7152802Z 2022-08-17T13:19:28.7153050Z Running tests... 2022-08-17T13:19:28.7153484Z ---------------------------------------------------------------------- 2022-08-17T13:19:30.2091059Z test_broadcast_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:19:30.2279216Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51849 2022-08-17T13:19:30.2286129Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51850 2022-08-17T13:19:30.2292367Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51851 2022-08-17T13:19:30.2298634Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51852 2022-08-17T13:19:31.6638753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:31.6639303Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:31.6648516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:31.6649004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:31.6759837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:31.6760306Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:31.6771670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:31.6772151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:31.7003571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:31.7004017Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:31.7014593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:31.7015074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:31.7082190Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:31.7082624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:31.7093545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:31.7094020Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:31.8324139Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:19:31.8480752Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:19:31.8656171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:19:31.8732492Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:19:34.0404578Z ok (5.325s) 2022-08-17T13:19:34.0404783Z 2022-08-17T13:19:34.0405193Z ---------------------------------------------------------------------- 2022-08-17T13:19:34.0405537Z Ran 1 test in 5.325s 2022-08-17T13:19:34.0405687Z 2022-08-17T13:19:34.0405784Z OK 2022-08-17T13:19:34.0405922Z 2022-08-17T13:19:34.0406060Z Generating XML reports... 
2022-08-17T13:19:34.0442231Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131928.xml 2022-08-17T13:19:35.8380670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:35.8381744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:35.8383454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:35.8383926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:36.0145607Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:19:36.0160951Z 2022-08-17T13:19:36.0161235Z Running tests... 2022-08-17T13:19:36.0161683Z ---------------------------------------------------------------------- 2022-08-17T13:19:37.5237946Z test_broadcast_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:19:37.5434345Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52036 2022-08-17T13:19:37.5440261Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52037 2022-08-17T13:19:37.5446801Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52038 2022-08-17T13:19:37.5452833Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52039 2022-08-17T13:19:38.9499052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:38.9499565Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:38.9508486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:38.9508982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:38.9786151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:38.9786620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:38.9798014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:38.9798517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:39.0401931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:39.0402400Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:39.0412539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:39.0413030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:39.0413606Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:39.0414062Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:39.0424670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:39.0425141Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:39.1157556Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:19:39.1507793Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:19:39.2109429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:19:39.2123121Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:19:39.6515283Z ok (3.635s) 2022-08-17T13:19:39.6515489Z 2022-08-17T13:19:39.6515874Z ---------------------------------------------------------------------- 2022-08-17T13:19:39.6516217Z Ran 1 test in 3.635s 2022-08-17T13:19:39.6516384Z 2022-08-17T13:19:39.6516488Z OK 2022-08-17T13:19:39.6516604Z 2022-08-17T13:19:39.6516740Z Generating XML reports... 
2022-08-17T13:19:39.6552176Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131936.xml 2022-08-17T13:19:41.4269983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:41.4270473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:41.4272878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:41.4273368Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:41.6025656Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:19:41.6040658Z 2022-08-17T13:19:41.6040855Z Running tests... 2022-08-17T13:19:41.6041300Z ---------------------------------------------------------------------- 2022-08-17T13:19:43.1248344Z test_broadcast_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:19:43.1442257Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52219 2022-08-17T13:19:43.1448449Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52220 2022-08-17T13:19:43.1454678Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52221 2022-08-17T13:19:43.1460962Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52222 2022-08-17T13:19:44.5440019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:44.5440999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:44.5448829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:44.5449805Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:44.5625744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:44.5626715Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:44.5637658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:44.5638625Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:44.5674327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:44.5675285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:44.5686177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:44.5687139Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:44.5864385Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:44.5865424Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:44.5877601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:44.5878603Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:44.7112560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:19:44.7315950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:19:44.7367930Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:19:44.7543707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:19:45.3528759Z ok (3.748s) 2022-08-17T13:19:45.3529056Z 2022-08-17T13:19:45.3529621Z ---------------------------------------------------------------------- 2022-08-17T13:19:45.3530303Z Ran 1 test in 3.749s 2022-08-17T13:19:45.3530454Z 2022-08-17T13:19:45.3530549Z OK 2022-08-17T13:19:45.3530684Z 2022-08-17T13:19:45.3530822Z Generating XML reports... 
2022-08-17T13:19:45.3566197Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131941.xml 2022-08-17T13:19:47.1181987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:47.1182501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:47.1185240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:47.1185717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:47.2879979Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:19:47.2894975Z 2022-08-17T13:19:47.2895197Z Running tests... 2022-08-17T13:19:47.2895701Z ---------------------------------------------------------------------- 2022-08-17T13:19:48.7625563Z test_broadcast_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:19:48.7810218Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52426 2022-08-17T13:19:48.7816588Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52427 2022-08-17T13:19:48.7823452Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52428 2022-08-17T13:19:48.7829547Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52429 2022-08-17T13:19:50.2143534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:50.2144271Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:50.2153318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:50.2153825Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:50.2559146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:50.2559619Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:50.2569756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:50.2570237Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:50.2597971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:50.2598423Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:50.2609911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:50.2610399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:50.2725293Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:50.2726030Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:50.2737206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:50.2737682Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:50.3857549Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:19:50.4238500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:19:50.4322831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:19:50.4454301Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:19:52.8953645Z ok (5.606s) 2022-08-17T13:19:52.8953895Z 2022-08-17T13:19:52.8954301Z ---------------------------------------------------------------------- 2022-08-17T13:19:52.8954648Z Ran 1 test in 5.606s 2022-08-17T13:19:52.8954836Z 2022-08-17T13:19:52.8954933Z OK 2022-08-17T13:19:52.8955051Z 2022-08-17T13:19:52.8955187Z Generating XML reports... 
2022-08-17T13:19:52.8990687Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131947.xml 2022-08-17T13:19:54.6780269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:54.6780809Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:54.6783631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:54.6784115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:54.8547262Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:19:54.8562654Z 2022-08-17T13:19:54.8562907Z Running tests... 2022-08-17T13:19:54.8563357Z ---------------------------------------------------------------------- 2022-08-17T13:19:56.3617955Z test_empty_tensors (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:19:56.3816224Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52637 2022-08-17T13:19:56.3822228Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52638 2022-08-17T13:19:56.3828864Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52639 2022-08-17T13:19:56.3835467Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52640 2022-08-17T13:19:57.8740558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:57.8741087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:57.8749958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:57.8750446Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:57.8894183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:57.8894648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:57.8905418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:57.8905897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:57.9307370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:57.9307840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:57.9318444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:57.9319195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:57.9529743Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:19:57.9530198Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:19:57.9540943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:19:57.9541415Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:19:58.0409692Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:19:58.0551064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:19:58.1012232Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:19:58.1204125Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:19:58.5905800Z ok (3.734s) 2022-08-17T13:19:58.5906032Z 2022-08-17T13:19:58.5906426Z ---------------------------------------------------------------------- 2022-08-17T13:19:58.5906750Z Ran 1 test in 3.734s 2022-08-17T13:19:58.5906915Z 2022-08-17T13:19:58.5907009Z OK 2022-08-17T13:19:58.5907917Z 2022-08-17T13:19:58.5908076Z Generating XML reports... 
2022-08-17T13:19:58.5943098Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131954.xml 2022-08-17T13:20:00.3520475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:00.3520980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:00.3523591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:00.3524373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:00.5297532Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:20:00.5313140Z 2022-08-17T13:20:00.5313397Z Running tests... 2022-08-17T13:20:00.5313995Z ---------------------------------------------------------------------- 2022-08-17T13:20:02.0473012Z test_gather_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:20:02.0671296Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52820 2022-08-17T13:20:02.0677702Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52821 2022-08-17T13:20:02.0684209Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52822 2022-08-17T13:20:02.0690810Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52823 2022-08-17T13:20:03.4819386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:03.4819900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:03.4828922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:03.4829422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:03.5690740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:03.5691217Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:03.5701215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:03.5701696Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:03.5831315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:03.5831761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:03.5842974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:03.5843474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:03.6056827Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:03.6057274Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:03.6068744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:03.6069220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:03.6486629Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:20:03.7412189Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:20:03.7476552Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:20:03.7767618Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:20:04.2759142Z ok (3.744s) 2022-08-17T13:20:04.2759353Z 2022-08-17T13:20:04.2759872Z ---------------------------------------------------------------------- 2022-08-17T13:20:04.2760320Z Ran 1 test in 3.744s 2022-08-17T13:20:04.2760495Z 2022-08-17T13:20:04.2760573Z OK 2022-08-17T13:20:04.2760706Z 2022-08-17T13:20:04.2760844Z Generating XML reports... 
2022-08-17T13:20:04.2797906Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132000.xml 2022-08-17T13:20:06.0366284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:06.0366829Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:06.0369253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:06.0369951Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:06.2138434Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:20:06.2153212Z 2022-08-17T13:20:06.2153460Z Running tests... 2022-08-17T13:20:06.2153875Z ---------------------------------------------------------------------- 2022-08-17T13:20:07.7269057Z test_gather_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:20:07.7464740Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53003 2022-08-17T13:20:07.7471299Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53004 2022-08-17T13:20:07.7477805Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53005 2022-08-17T13:20:07.7484024Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53006 2022-08-17T13:20:09.1560170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:09.1560683Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:09.1569921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:09.1570397Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:09.1905608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:09.1906087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:09.1916756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:09.1917260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:09.2170291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:09.2170776Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:09.2181731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:09.2182214Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:09.2379329Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:09.2379796Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:09.2391252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:09.2391906Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:09.3279257Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:20:09.3566383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:20:09.3887965Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:20:09.4086462Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:20:11.4586703Z ok (5.243s) 2022-08-17T13:20:11.4587137Z 2022-08-17T13:20:11.4587885Z ---------------------------------------------------------------------- 2022-08-17T13:20:11.4588281Z Ran 1 test in 5.243s 2022-08-17T13:20:11.4588447Z 2022-08-17T13:20:11.4588539Z OK 2022-08-17T13:20:11.4588672Z 2022-08-17T13:20:11.4588806Z Generating XML reports... 
2022-08-17T13:20:11.4624305Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132006.xml 2022-08-17T13:20:13.2484735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:13.2485230Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:13.2488045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:13.2488547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:13.4253496Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:20:13.4269101Z 2022-08-17T13:20:13.4269555Z Running tests... 2022-08-17T13:20:13.4270046Z ---------------------------------------------------------------------- 2022-08-17T13:20:14.9367333Z test_gather_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:20:14.9562208Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53190 2022-08-17T13:20:14.9568555Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53191 2022-08-17T13:20:14.9575119Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53192 2022-08-17T13:20:14.9581595Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53193 2022-08-17T13:20:16.3469778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:16.3470289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:16.3479257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:16.3479723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:16.3724263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:16.3724744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:16.3735942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:16.3736430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:16.3741760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:16.3742219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:16.3753221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:16.3753681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:16.3913033Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:16.3913733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:16.3925158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:16.3925630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:16.5143032Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:20:16.5437223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:20:16.5451433Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:20:16.5635165Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:20:17.0646783Z ok (3.637s) 2022-08-17T13:20:17.0646987Z 2022-08-17T13:20:17.0647377Z ---------------------------------------------------------------------- 2022-08-17T13:20:17.0647738Z Ran 1 test in 3.638s 2022-08-17T13:20:17.0647884Z 2022-08-17T13:20:17.0647978Z OK 2022-08-17T13:20:17.0648112Z 2022-08-17T13:20:17.0648245Z Generating XML reports... 
2022-08-17T13:20:17.0684663Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132013.xml 2022-08-17T13:20:18.8448490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:18.8448973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:18.8451647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:18.8452133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:19.0205541Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:20:19.0220050Z 2022-08-17T13:20:19.0220465Z Running tests... 2022-08-17T13:20:19.0220977Z ---------------------------------------------------------------------- 2022-08-17T13:20:20.5316643Z test_gather_noncontiguous_input (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:20:20.5510422Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53373 2022-08-17T13:20:20.5516789Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53374 2022-08-17T13:20:20.5523251Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53375 2022-08-17T13:20:20.5529855Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53376 2022-08-17T13:20:22.0243572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:22.0244071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:22.0253871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:22.0254369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:22.0461105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:22.0461594Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:22.0472850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:22.0473334Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:22.0485581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:22.0486037Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:22.0496861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:22.0497490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:22.0602616Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:22.0603082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:22.0613788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:22.0614274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:22.1997763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:20:22.2189585Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:20:22.2190324Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:20:22.2287916Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:20:22.6594547Z ok (3.637s) 2022-08-17T13:20:22.6594863Z 2022-08-17T13:20:22.6595628Z ---------------------------------------------------------------------- 2022-08-17T13:20:22.6596166Z Ran 1 test in 3.637s 2022-08-17T13:20:22.6596347Z 2022-08-17T13:20:22.6596441Z OK 2022-08-17T13:20:22.6596582Z 2022-08-17T13:20:22.6596714Z Generating XML reports... 
2022-08-17T13:20:22.6631980Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132019.xml 2022-08-17T13:20:24.3897647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:24.3898157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:24.3900440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:24.3900939Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:24.5598151Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:20:24.5612330Z 2022-08-17T13:20:24.5612588Z Running tests... 2022-08-17T13:20:24.5613029Z ---------------------------------------------------------------------- 2022-08-17T13:20:26.0281249Z test_gather_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:20:26.0467487Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53556 2022-08-17T13:20:26.0474493Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53557 2022-08-17T13:20:26.0480892Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53558 2022-08-17T13:20:26.0487350Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53559 2022-08-17T13:20:27.5460034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:27.5460562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:27.5469426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:27.5470222Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:27.5485189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:27.5485655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:27.5488042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:27.5488508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:27.5497129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:27.5497619Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:27.5500197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:27.5500682Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:27.5866831Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:27.5867310Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:27.5878392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:27.5878867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:27.7173134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:20:27.7232790Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:20:27.7286612Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:20:27.7597208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:20:28.8573195Z ok (4.296s) 2022-08-17T13:20:28.8573458Z 2022-08-17T13:20:28.8573854Z ---------------------------------------------------------------------- 2022-08-17T13:20:28.8574191Z Ran 1 test in 4.296s 2022-08-17T13:20:28.8574361Z 2022-08-17T13:20:28.8574457Z OK 2022-08-17T13:20:28.8574575Z 2022-08-17T13:20:28.8574713Z Generating XML reports... 
2022-08-17T13:20:28.8612055Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132024.xml 2022-08-17T13:20:30.6820327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:30.6820823Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:30.6823260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:30.6823771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:30.8603481Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:20:30.8618285Z 2022-08-17T13:20:30.8618713Z Running tests... 2022-08-17T13:20:30.8619217Z ---------------------------------------------------------------------- 2022-08-17T13:20:32.3683379Z test_gather_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:20:32.3877343Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53763 2022-08-17T13:20:32.3883711Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53764 2022-08-17T13:20:32.3889786Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53765 2022-08-17T13:20:32.3896064Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53766 2022-08-17T13:20:33.7996878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:33.7997670Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:33.8006876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:33.8007373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:33.8103316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:33.8103762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:33.8114861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:33.8115344Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:33.8384611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:33.8385066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:33.8395760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:33.8396238Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:33.8795621Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:33.8796090Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:33.8806371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:33.8806849Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:33.9710412Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:20:33.9790039Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:20:34.0047853Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:20:34.0532636Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:20:38.1044959Z ok (7.242s) 2022-08-17T13:20:38.1045232Z 2022-08-17T13:20:38.1045626Z ---------------------------------------------------------------------- 2022-08-17T13:20:38.1045967Z Ran 1 test in 7.243s 2022-08-17T13:20:38.1046132Z 2022-08-17T13:20:38.1046226Z OK 2022-08-17T13:20:38.1046367Z 2022-08-17T13:20:38.1046484Z Generating XML reports... 
2022-08-17T13:20:38.1082486Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132030.xml 2022-08-17T13:20:39.8454342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:39.8454885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:39.8457011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:39.8457485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:40.0211247Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:20:40.0226474Z 2022-08-17T13:20:40.0226908Z Running tests... 2022-08-17T13:20:40.0227405Z ---------------------------------------------------------------------- 2022-08-17T13:20:41.5285689Z test_multi_device_constructor (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:20:41.5479851Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53974 2022-08-17T13:20:41.5485590Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53975 2022-08-17T13:20:41.5492153Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53976 2022-08-17T13:20:41.5498560Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53977 2022-08-17T13:20:42.9815796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:42.9816743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:42.9825107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:42.9826094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:42.9941508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:42.9942433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:42.9955200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:42.9956420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:42.9984367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:42.9985284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:42.9995975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:42.9996938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:43.0223099Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:43.0224320Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:43.0234656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:43.0235635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:43.1480402Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:20:43.1614115Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:20:43.1656114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:20:43.1877280Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:20:43.6562882Z ok (3.633s) 2022-08-17T13:20:43.6563068Z 2022-08-17T13:20:43.6563450Z ---------------------------------------------------------------------- 2022-08-17T13:20:43.6563772Z Ran 1 test in 3.634s 2022-08-17T13:20:43.6563938Z 2022-08-17T13:20:43.6564037Z OK 2022-08-17T13:20:43.6564173Z 2022-08-17T13:20:43.6564304Z Generating XML reports... 
2022-08-17T13:20:43.6600763Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132040.xml 2022-08-17T13:20:45.4208039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:45.4208783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:45.4210966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:45.4211452Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:45.5971441Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:20:45.5987048Z 2022-08-17T13:20:45.5987196Z Running tests... 2022-08-17T13:20:45.5987636Z ---------------------------------------------------------------------- 2022-08-17T13:20:47.1047215Z test_reduce_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:20:47.1235207Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54161 2022-08-17T13:20:47.1241441Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54162 2022-08-17T13:20:47.1247860Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54163 2022-08-17T13:20:47.1253780Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54164 2022-08-17T13:20:48.5372745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:48.5373694Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:48.5381470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:48.5382448Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:48.5384673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:48.5386013Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:48.5396745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:48.5397699Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:48.5965657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:48.5966617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:48.5977693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:48.5978677Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:48.6053606Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:48.6054507Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:48.6064555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:48.6066482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:48.7089918Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:20:48.7096704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:20:48.7633406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:20:48.7698037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:20:49.2318434Z ok (3.633s) 2022-08-17T13:20:49.2318636Z 2022-08-17T13:20:49.2319030Z ---------------------------------------------------------------------- 2022-08-17T13:20:49.2319394Z Ran 1 test in 3.633s 2022-08-17T13:20:49.2319564Z 2022-08-17T13:20:49.2319656Z OK 2022-08-17T13:20:49.2319789Z 2022-08-17T13:20:49.2321014Z Generating XML reports... 
2022-08-17T13:20:49.2358114Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132045.xml 2022-08-17T13:20:50.9894818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:50.9895324Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:50.9897633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:50.9898122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:51.1682198Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:20:51.1697056Z 2022-08-17T13:20:51.1697518Z Running tests... 2022-08-17T13:20:51.1698039Z ---------------------------------------------------------------------- 2022-08-17T13:20:52.6788387Z test_reduce_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:20:52.6985679Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54344 2022-08-17T13:20:52.6992208Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54345 2022-08-17T13:20:52.6998500Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54346 2022-08-17T13:20:52.7004784Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54347 2022-08-17T13:20:54.1163344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:54.1163846Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:54.1172542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:54.1173349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:54.1188512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:54.1188993Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:54.1199620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:54.1200151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:54.1366057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:54.1366526Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:54.1377969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:54.1378469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:54.1947957Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:54.1948472Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:54.1959431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:54.1959916Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:54.2881435Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:20:54.2887413Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:20:54.3081459Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:20:54.3620650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:20:56.4108596Z ok (5.241s) 2022-08-17T13:20:56.4108862Z 2022-08-17T13:20:56.4109403Z ---------------------------------------------------------------------- 2022-08-17T13:20:56.4109750Z Ran 1 test in 5.241s 2022-08-17T13:20:56.4109900Z 2022-08-17T13:20:56.4110012Z OK 2022-08-17T13:20:56.4110154Z 2022-08-17T13:20:56.4110290Z Generating XML reports... 
2022-08-17T13:20:56.4146879Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132051.xml 2022-08-17T13:20:58.1696165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:20:58.1696705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:20:58.1698972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:20:58.1699456Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:20:58.3400548Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:20:58.3415372Z 2022-08-17T13:20:58.3415523Z Running tests... 2022-08-17T13:20:58.3416265Z ---------------------------------------------------------------------- 2022-08-17T13:20:59.8186479Z test_reduce_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:20:59.8373070Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54531 2022-08-17T13:20:59.8379322Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54532 2022-08-17T13:20:59.8385410Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54533 2022-08-17T13:20:59.8391758Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54534 2022-08-17T13:21:01.2540286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:01.2541689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:01.2549566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:01.2550538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:01.2575331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:01.2576256Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:01.2586698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:01.2587621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:01.2794294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:01.2795232Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:01.2805039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:01.2806046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:01.2821370Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:01.2822328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:01.2835353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:01.2836341Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:01.4225665Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:21:01.4273229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:21:01.4466894Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:21:01.4575864Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:21:01.9454497Z ok (3.604s) 2022-08-17T13:21:01.9454780Z 2022-08-17T13:21:01.9455403Z ---------------------------------------------------------------------- 2022-08-17T13:21:01.9455757Z Ran 1 test in 3.604s 2022-08-17T13:21:01.9455921Z 2022-08-17T13:21:01.9456013Z OK 2022-08-17T13:21:01.9456150Z 2022-08-17T13:21:01.9456269Z Generating XML reports... 
2022-08-17T13:21:01.9491209Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132058.xml 2022-08-17T13:21:03.6932529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:03.6933583Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:03.6935324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:03.6936284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:03.8709524Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:21:03.8725261Z 2022-08-17T13:21:03.8725639Z Running tests... 2022-08-17T13:21:03.8726515Z ---------------------------------------------------------------------- 2022-08-17T13:21:05.3677597Z test_reduce_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:21:05.3864818Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54714 2022-08-17T13:21:05.3871597Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54715 2022-08-17T13:21:05.3878114Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54716 2022-08-17T13:21:05.3884146Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54717 2022-08-17T13:21:06.8653107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:06.8653599Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:06.8662165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:06.8662647Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:06.9015807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:06.9016260Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:06.9027271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:06.9027753Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:06.9123868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:06.9124313Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:06.9134880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:06.9135354Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:06.9275635Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:06.9276115Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:06.9287523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:06.9288016Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:07.0342956Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:21:07.0712519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:21:07.0800591Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:21:07.1004484Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:21:07.9960371Z ok (4.123s) 2022-08-17T13:21:07.9960618Z 2022-08-17T13:21:07.9961317Z ---------------------------------------------------------------------- 2022-08-17T13:21:07.9961676Z Ran 1 test in 4.123s 2022-08-17T13:21:07.9961847Z 2022-08-17T13:21:07.9961941Z OK 2022-08-17T13:21:07.9962085Z 2022-08-17T13:21:07.9962221Z Generating XML reports... 
2022-08-17T13:21:07.9998608Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132103.xml 2022-08-17T13:21:09.7991868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:09.7992409Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:09.7995065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:09.7995612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:09.9744724Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:21:09.9759865Z 2022-08-17T13:21:09.9760009Z Running tests... 2022-08-17T13:21:09.9760451Z ---------------------------------------------------------------------- 2022-08-17T13:21:11.4717409Z test_reduce_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:21:11.4913494Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54921 2022-08-17T13:21:11.4919604Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54922 2022-08-17T13:21:11.4925939Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54923 2022-08-17T13:21:11.4932316Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54924 2022-08-17T13:21:12.9077867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:12.9078378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:12.9087254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:12.9087731Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:12.9174418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:12.9174879Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:12.9185786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:12.9186263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:12.9337145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:12.9337610Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:12.9349044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:12.9349506Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:12.9577179Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:12.9577639Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:12.9589135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:12.9589608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:13.0741294Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:21:13.0840697Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:21:13.1060274Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:21:13.1254710Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:21:16.0055012Z ok (6.029s) 2022-08-17T13:21:16.0055216Z 2022-08-17T13:21:16.0055716Z ---------------------------------------------------------------------- 2022-08-17T13:21:16.0056285Z Ran 1 test in 6.029s 2022-08-17T13:21:16.0056458Z 2022-08-17T13:21:16.0056535Z OK 2022-08-17T13:21:16.0056672Z 2022-08-17T13:21:16.0056809Z Generating XML reports... 
2022-08-17T13:21:16.0092458Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132109.xml 2022-08-17T13:21:17.7455393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:17.7456429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:17.7457782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:17.7458272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:17.9147446Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:21:17.9161839Z 2022-08-17T13:21:17.9162068Z Running tests... 2022-08-17T13:21:17.9162485Z ---------------------------------------------------------------------- 2022-08-17T13:21:19.3719387Z test_round_robin (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:21:19.3906942Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55132 2022-08-17T13:21:19.3912772Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55133 2022-08-17T13:21:19.3919083Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55134 2022-08-17T13:21:19.3925702Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55135 2022-08-17T13:21:20.8419414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:20.8419949Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:20.8429236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:20.8429732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:20.8577428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:20.8577917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:20.8588926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:20.8589411Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:20.8929469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:20.8929920Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:20.8941142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:20.8941621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:20.9200470Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:20.9200962Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:20.9212371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:20.9212859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:21.0112560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:21:21.0253243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:21:21.0669462Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:21:21.0922973Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:21:21.5994661Z ok (3.683s) 2022-08-17T13:21:21.5994878Z 2022-08-17T13:21:21.5995264Z ---------------------------------------------------------------------- 2022-08-17T13:21:21.5995590Z Ran 1 test in 3.683s 2022-08-17T13:21:21.5995777Z 2022-08-17T13:21:21.5995872Z OK 2022-08-17T13:21:21.5996009Z 2022-08-17T13:21:21.5996147Z Generating XML reports... 
2022-08-17T13:21:21.6032038Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132117.xml 2022-08-17T13:21:23.3636575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:23.3637288Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:23.3639099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:23.3639592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:23.5402434Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:21:23.5417403Z 2022-08-17T13:21:23.5417642Z Running tests... 2022-08-17T13:21:23.5418052Z ---------------------------------------------------------------------- 2022-08-17T13:21:25.0530890Z test_round_robin_create_destroy (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:21:25.0731338Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55327 2022-08-17T13:21:25.0736906Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55328 2022-08-17T13:21:25.0743071Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55329 2022-08-17T13:21:25.0749652Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55330 2022-08-17T13:21:26.4892288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:26.4892807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:26.4901789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:26.4902329Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:26.5120793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:26.5121291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:26.5132574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:26.5133085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:26.5335316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:26.5335803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:26.5347100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:26.5347606Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:26.5534165Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:26.5534656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:26.5545123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:26.5545629Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:26.6575152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:21:26.6853133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:21:26.7046624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:21:26.7191263Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:21:27.4821028Z ok (3.940s) 2022-08-17T13:21:27.4821238Z 2022-08-17T13:21:27.4821597Z ---------------------------------------------------------------------- 2022-08-17T13:21:27.4821960Z Ran 1 test in 3.940s 2022-08-17T13:21:27.4822129Z 2022-08-17T13:21:27.4822514Z OK 2022-08-17T13:21:27.4822667Z 2022-08-17T13:21:27.4822805Z Generating XML reports... 
2022-08-17T13:21:27.4857705Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132123.xml 2022-08-17T13:21:29.2669435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:29.2670091Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:29.2672640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:29.2673132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:29.4427064Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:21:29.4442510Z 2022-08-17T13:21:29.4442783Z Running tests... 2022-08-17T13:21:29.4443217Z ---------------------------------------------------------------------- 2022-08-17T13:21:30.9512841Z test_scatter_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:21:30.9700992Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55546 2022-08-17T13:21:30.9708291Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55547 2022-08-17T13:21:30.9714507Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55548 2022-08-17T13:21:30.9720742Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55549 2022-08-17T13:21:32.3527606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:32.3528534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:32.3537148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:32.3538132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:32.3936217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:32.3937182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:32.3947367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:32.3948338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:32.4163998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:32.4164946Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:32.4174982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:32.4175981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:32.4368676Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:32.4369662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:32.4381329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:32.4382335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:32.5215288Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:21:32.5621426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:21:32.5831262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:21:32.6082840Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:21:33.0784073Z ok (3.634s) 2022-08-17T13:21:33.0784576Z 2022-08-17T13:21:33.0785006Z ---------------------------------------------------------------------- 2022-08-17T13:21:33.0785335Z Ran 1 test in 3.634s 2022-08-17T13:21:33.0785501Z 2022-08-17T13:21:33.0785597Z OK 2022-08-17T13:21:33.0785736Z 2022-08-17T13:21:33.0785873Z Generating XML reports... 
2022-08-17T13:21:33.0820486Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132129.xml 2022-08-17T13:21:34.8717279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:34.8717831Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:34.8720082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:34.8720916Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:35.0478253Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:21:35.0492995Z 2022-08-17T13:21:35.0493185Z Running tests... 2022-08-17T13:21:35.0493616Z ---------------------------------------------------------------------- 2022-08-17T13:21:36.5530785Z test_scatter_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:21:36.5726695Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55729 2022-08-17T13:21:36.5732852Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55730 2022-08-17T13:21:36.5739083Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55731 2022-08-17T13:21:36.5745549Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55732 2022-08-17T13:21:37.9924957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:37.9925913Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:37.9934407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:37.9935361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:38.0047567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:38.0048492Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:38.0060035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:38.0061004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:38.0086470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:38.0087406Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:38.0099041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:38.0100029Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:38.0223915Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:38.0225128Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:38.0236133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:38.0237125Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:38.1598182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:21:38.1724302Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:21:38.1789065Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:21:38.1947379Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:21:40.3850433Z ok (5.335s) 2022-08-17T13:21:40.3850635Z 2022-08-17T13:21:40.3851024Z ---------------------------------------------------------------------- 2022-08-17T13:21:40.3851387Z Ran 1 test in 5.336s 2022-08-17T13:21:40.3851556Z 2022-08-17T13:21:40.3851632Z OK 2022-08-17T13:21:40.3851767Z 2022-08-17T13:21:40.3851907Z Generating XML reports... 
2022-08-17T13:21:40.3887728Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132135.xml 2022-08-17T13:21:42.1521221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:42.1522034Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:42.1523620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:42.1524089Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:42.3300674Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:21:42.3315677Z 2022-08-17T13:21:42.3316189Z Running tests... 2022-08-17T13:21:42.3316670Z ---------------------------------------------------------------------- 2022-08-17T13:21:43.8480546Z test_scatter_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:21:43.8669403Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55916 2022-08-17T13:21:43.8676295Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55917 2022-08-17T13:21:43.8682324Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55918 2022-08-17T13:21:43.8688654Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55919 2022-08-17T13:21:45.3539983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:45.3540509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:45.3550786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:45.3551284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:45.4029845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:45.4030322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:45.4040649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:45.4041136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:45.4053387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:45.4053842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:45.4065105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:45.4065563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:45.4443778Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:45.4444247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:45.4453943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:45.4454420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:45.5217572Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:21:45.5720363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:21:45.5753814Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:21:45.6112037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:21:46.0754638Z ok (3.744s) 2022-08-17T13:21:46.0754844Z 2022-08-17T13:21:46.0755245Z ---------------------------------------------------------------------- 2022-08-17T13:21:46.0755568Z Ran 1 test in 3.744s 2022-08-17T13:21:46.0755731Z 2022-08-17T13:21:46.0755825Z OK 2022-08-17T13:21:46.0755962Z 2022-08-17T13:21:46.0756098Z Generating XML reports... 
2022-08-17T13:21:46.0793232Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132142.xml 2022-08-17T13:21:47.8332744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:47.8333264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:47.8335654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:47.8336146Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:48.0017135Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:21:48.0032320Z 2022-08-17T13:21:48.0032583Z Running tests... 2022-08-17T13:21:48.0033002Z ---------------------------------------------------------------------- 2022-08-17T13:21:49.4650566Z test_scatter_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:21:49.4839488Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56099 2022-08-17T13:21:49.4845652Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56100 2022-08-17T13:21:49.4851622Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56101 2022-08-17T13:21:49.4857767Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56102 2022-08-17T13:21:50.9142843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:50.9144119Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:50.9153331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:50.9154308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:50.9183569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:50.9184777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:50.9194824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:50.9195780Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:50.9240027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:50.9240898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:50.9251519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:50.9252454Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:50.9380407Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:50.9381337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:50.9393605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:50.9394981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:51.0809499Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:21:51.0857837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:21:51.0932179Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:21:51.1114311Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:21:52.2937311Z ok (4.290s) 2022-08-17T13:21:52.2937519Z 2022-08-17T13:21:52.2938135Z ---------------------------------------------------------------------- 2022-08-17T13:21:52.2938487Z Ran 1 test in 4.290s 2022-08-17T13:21:52.2938991Z 2022-08-17T13:21:52.2939087Z OK 2022-08-17T13:21:52.2939226Z 2022-08-17T13:21:52.2939364Z Generating XML reports... 
2022-08-17T13:21:52.2974656Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132147.xml 2022-08-17T13:21:54.0740663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:54.0741164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:54.0743280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:54.0743768Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:54.2502047Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:21:54.2518182Z 2022-08-17T13:21:54.2518505Z Running tests... 2022-08-17T13:21:54.2518935Z ---------------------------------------------------------------------- 2022-08-17T13:21:54.2525630Z test_scatter_stress_cuda (__main__.ProcessGroupGlooTest) ... skip: Test is flaky, see https://github.com/pytorch/pytorch/issues/15963 (0.001s) 2022-08-17T13:21:54.2525980Z 2022-08-17T13:21:54.2526265Z ---------------------------------------------------------------------- 2022-08-17T13:21:54.2526573Z Ran 1 test in 0.001s 2022-08-17T13:21:54.2526738Z 2022-08-17T13:21:54.2526848Z OK (skipped=1) 2022-08-17T13:21:54.2527001Z 2022-08-17T13:21:54.2527126Z Generating XML reports... 
2022-08-17T13:21:54.2559762Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132154.xml 2022-08-17T13:21:55.9084728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:55.9085230Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:55.9087710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:55.9088194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:56.0856999Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:21:56.0872750Z 2022-08-17T13:21:56.0873129Z Running tests... 2022-08-17T13:21:56.0873630Z ---------------------------------------------------------------------- 2022-08-17T13:21:57.5942512Z test_send_recv_all_to_all (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:21:57.6137990Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56339 2022-08-17T13:21:57.6144185Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56340 2022-08-17T13:21:57.6151436Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56341 2022-08-17T13:21:57.6157552Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56342 2022-08-17T13:21:59.0273439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:59.0273933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:59.0283542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:59.0284038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:59.0552511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:59.0552978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:59.0563770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:59.0564232Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:59.0887680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:59.0888157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:59.0899118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:59.0899580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:59.0983790Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:21:59.0984257Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:21:59.0995821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:21:59.0996277Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:21:59.1949724Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:21:59.2231140Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:21:59.2550600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:21:59.2653441Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:21:59.7220203Z ok (3.634s) 2022-08-17T13:21:59.7220407Z 2022-08-17T13:21:59.7220807Z ---------------------------------------------------------------------- 2022-08-17T13:21:59.7221170Z Ran 1 test in 3.635s 2022-08-17T13:21:59.7221337Z 2022-08-17T13:21:59.7221433Z OK 2022-08-17T13:21:59.7221552Z 2022-08-17T13:21:59.7221686Z Generating XML reports... 
2022-08-17T13:21:59.7259193Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132156.xml 2022-08-17T13:22:01.5158092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:01.5158625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:01.5161265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:01.5161776Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:01.6936897Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:22:01.6951915Z 2022-08-17T13:22:01.6952116Z Running tests... 2022-08-17T13:22:01.6952766Z ---------------------------------------------------------------------- 2022-08-17T13:22:01.6957824Z test_sparse_allreduce_basics (__main__.ProcessGroupGlooTest) ... skip: intermittent failures on Windows, in CI (0.000s) 2022-08-17T13:22:01.6958369Z 2022-08-17T13:22:01.6958819Z ---------------------------------------------------------------------- 2022-08-17T13:22:01.6959159Z Ran 1 test in 0.001s 2022-08-17T13:22:01.6959341Z 2022-08-17T13:22:01.6959452Z OK (skipped=1) 2022-08-17T13:22:01.6959606Z 2022-08-17T13:22:01.6959732Z Generating XML reports... 
2022-08-17T13:22:01.6993936Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132201.xml 2022-08-17T13:22:03.3631263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:03.3631786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:03.3634164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:03.3634648Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:03.5386682Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:22:03.5401862Z 2022-08-17T13:22:03.5402004Z Running tests... 2022-08-17T13:22:03.5402794Z ---------------------------------------------------------------------- 2022-08-17T13:22:05.0493365Z test_sparse_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:22:05.0688650Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56555 2022-08-17T13:22:05.0695325Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56556 2022-08-17T13:22:05.0701955Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56557 2022-08-17T13:22:05.0708985Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56558 2022-08-17T13:22:06.5196899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:06.5197440Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:06.5206064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:06.5206571Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:06.5413585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:06.5414059Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:06.5424864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:06.5425342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:06.5425904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:06.5426352Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:06.5437030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:06.5437501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:06.5702595Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:06.5703087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:06.5714800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:06.5715279Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:06.6868580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:22:06.7147106Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:22:06.7158864Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:22:06.7419295Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:22:08.9817638Z ok (5.441s) 2022-08-17T13:22:08.9817898Z 2022-08-17T13:22:08.9818606Z ---------------------------------------------------------------------- 2022-08-17T13:22:08.9819276Z Ran 1 test in 5.441s 2022-08-17T13:22:08.9819463Z 2022-08-17T13:22:08.9819711Z OK 2022-08-17T13:22:08.9819870Z 2022-08-17T13:22:08.9820008Z Generating XML reports... 
2022-08-17T13:22:08.9859863Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132203.xml 2022-08-17T13:22:10.7701441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:10.7701948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:10.7704506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:10.7705003Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:10.9461304Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:22:10.9476841Z 2022-08-17T13:22:10.9477132Z Running tests... 2022-08-17T13:22:10.9477599Z ---------------------------------------------------------------------- 2022-08-17T13:22:12.4554461Z test_sparse_allreduce_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:22:12.4751142Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56922 2022-08-17T13:22:12.4757758Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56923 2022-08-17T13:22:12.4763851Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56924 2022-08-17T13:22:12.4770545Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56925 2022-08-17T13:22:13.8861624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:13.8862597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:13.8871152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:13.8872146Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:13.8889935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:13.8890879Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:13.8902356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:13.8903606Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:13.9638494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:13.9639540Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:13.9650086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:13.9651091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:13.9789989Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:13.9790984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:13.9802461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:13.9803428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:14.0570519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:22:14.0594142Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:22:14.1364264Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:22:14.1516284Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:22:14.5835477Z ok (3.636s) 2022-08-17T13:22:14.5835976Z 2022-08-17T13:22:14.5836409Z ---------------------------------------------------------------------- 2022-08-17T13:22:14.5836789Z Ran 1 test in 3.636s 2022-08-17T13:22:14.5836964Z 2022-08-17T13:22:14.5837041Z OK 2022-08-17T13:22:14.5837181Z 2022-08-17T13:22:14.5837317Z Generating XML reports... 
2022-08-17T13:22:14.5873001Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132210.xml 2022-08-17T13:22:16.3406067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:16.3406570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:16.3409285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:16.3410087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:16.5167454Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:22:16.5182715Z 2022-08-17T13:22:16.5183079Z Running tests... 2022-08-17T13:22:16.5183579Z ---------------------------------------------------------------------- 2022-08-17T13:22:16.5251728Z test_forward_backward (__main__.ReducerTest) ... ok (0.007s) 2022-08-17T13:22:16.5302487Z 2022-08-17T13:22:16.5303563Z ---------------------------------------------------------------------- 2022-08-17T13:22:16.5303965Z Ran 1 test in 0.012s 2022-08-17T13:22:16.5304136Z 2022-08-17T13:22:16.5304244Z OK 2022-08-17T13:22:16.5304363Z 2022-08-17T13:22:16.5304494Z Generating XML reports... 
2022-08-17T13:22:16.5336434Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20220817132216.xml 2022-08-17T13:22:18.1481731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:18.1482239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:18.1485191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:18.1485662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:18.3277861Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:22:18.3293258Z 2022-08-17T13:22:18.3293654Z Running tests... 2022-08-17T13:22:18.3294212Z ---------------------------------------------------------------------- 2022-08-17T13:22:18.3380482Z test_forward_backward_optimizer (__main__.ReducerTest) ... [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:22:18.3400703Z ok (0.011s) 2022-08-17T13:22:18.3415446Z 2022-08-17T13:22:18.3415968Z ---------------------------------------------------------------------- 2022-08-17T13:22:18.3416332Z Ran 1 test in 0.012s 2022-08-17T13:22:18.3416500Z 2022-08-17T13:22:18.3416603Z OK 2022-08-17T13:22:18.3416722Z 2022-08-17T13:22:18.3416854Z Generating XML reports... 
2022-08-17T13:22:18.3449003Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20220817132218.xml 2022-08-17T13:22:19.9788926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:19.9789450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:19.9791955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:19.9792473Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:20.1541136Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:22:20.1556144Z 2022-08-17T13:22:20.1556316Z Running tests... 2022-08-17T13:22:20.1556762Z ---------------------------------------------------------------------- 2022-08-17T13:22:20.1627679Z test_forward_backward_unused_parameters (__main__.ReducerTest) ... ok (0.007s) 2022-08-17T13:22:20.1676793Z 2022-08-17T13:22:20.1677199Z ---------------------------------------------------------------------- 2022-08-17T13:22:20.1677522Z Ran 1 test in 0.012s 2022-08-17T13:22:20.1677693Z 2022-08-17T13:22:20.1677786Z OK 2022-08-17T13:22:20.1678173Z 2022-08-17T13:22:20.1678312Z Generating XML reports... 
2022-08-17T13:22:20.1711365Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20220817132220.xml 2022-08-17T13:22:21.8142871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:21.8143620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:21.8145692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:21.8146162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:21.9932900Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:22:21.9948627Z 2022-08-17T13:22:21.9948904Z Running tests... 2022-08-17T13:22:21.9949338Z ---------------------------------------------------------------------- 2022-08-17T13:22:21.9986631Z test_multi_dtype_multi_bucket (__main__.ReducerTest) ... ok (0.004s) 2022-08-17T13:22:22.0067117Z 2022-08-17T13:22:22.0067523Z ---------------------------------------------------------------------- 2022-08-17T13:22:22.0067858Z Ran 1 test in 0.012s 2022-08-17T13:22:22.0068032Z 2022-08-17T13:22:22.0068127Z OK 2022-08-17T13:22:22.0068264Z 2022-08-17T13:22:22.0068374Z Generating XML reports... 
2022-08-17T13:22:22.0100401Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20220817132221.xml 2022-08-17T13:22:23.6611391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:23.6611898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:23.6614081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:23.6614809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:23.8380823Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:22:23.8396730Z 2022-08-17T13:22:23.8397142Z Running tests... 2022-08-17T13:22:23.8397647Z ---------------------------------------------------------------------- 2022-08-17T13:22:23.8460980Z test_multi_dtype_single_bucket (__main__.ReducerTest) ... ok (0.006s) 2022-08-17T13:22:23.8515465Z 2022-08-17T13:22:23.8515863Z ---------------------------------------------------------------------- 2022-08-17T13:22:23.8516203Z Ran 1 test in 0.012s 2022-08-17T13:22:23.8516376Z 2022-08-17T13:22:23.8516470Z OK 2022-08-17T13:22:23.8516605Z 2022-08-17T13:22:23.8516716Z Generating XML reports... 
2022-08-17T13:22:23.8548940Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20220817132223.xml 2022-08-17T13:22:25.4660863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:25.4661388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:25.4663711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:25.4664457Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:25.6369000Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:22:25.6383507Z 2022-08-17T13:22:25.6383778Z Running tests... 2022-08-17T13:22:25.6384187Z ---------------------------------------------------------------------- 2022-08-17T13:22:25.6417650Z test_single_dtype_single_bucket (__main__.ReducerTest) ... ok (0.003s) 2022-08-17T13:22:25.6503137Z 2022-08-17T13:22:25.6503577Z ---------------------------------------------------------------------- 2022-08-17T13:22:25.6503914Z Ran 1 test in 0.012s 2022-08-17T13:22:25.6504086Z 2022-08-17T13:22:25.6504160Z OK 2022-08-17T13:22:25.6504296Z 2022-08-17T13:22:25.6504601Z Generating XML reports... 
2022-08-17T13:22:25.6535641Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20220817132225.xml 2022-08-17T13:22:27.2815930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:27.2816439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:27.2818579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:27.2819063Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:27.4581604Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:22:27.4596497Z 2022-08-17T13:22:27.4596846Z Running tests... 2022-08-17T13:22:27.4597286Z ---------------------------------------------------------------------- 2022-08-17T13:22:28.9836523Z test_logging_init (__main__.RendezvousEnvTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:22:29.0019375Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:22:29.0020236Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:22:29.0118766Z ok (1.552s) 2022-08-17T13:22:29.0119643Z 2022-08-17T13:22:29.0119952Z ---------------------------------------------------------------------- 2022-08-17T13:22:29.0120294Z Ran 1 test in 1.552s 2022-08-17T13:22:29.0120464Z 2022-08-17T13:22:29.0120559Z OK 2022-08-17T13:22:29.0120694Z 2022-08-17T13:22:29.0120803Z Generating XML reports... 
2022-08-17T13:22:29.0154366Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-RendezvousEnvTest-20220817132227.xml 2022-08-17T13:22:30.7447797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:30.7448815Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:30.7450932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:30.7451868Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:30.9213898Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-08-17T13:22:30.9230509Z 2022-08-17T13:22:30.9230930Z Running tests... 2022-08-17T13:22:30.9231426Z ---------------------------------------------------------------------- 2022-08-17T13:22:32.4249973Z test_default_store_timeout_gloo (__main__.TimeoutTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:22:32.4419163Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/74714 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.519s) 2022-08-17T13:22:32.4420385Z 2022-08-17T13:22:32.4420936Z ---------------------------------------------------------------------- 2022-08-17T13:22:32.4421589Z Ran 1 test in 1.519s 2022-08-17T13:22:32.4421899Z 2022-08-17T13:22:32.4422390Z OK (skipped=1) 2022-08-17T13:22:32.4422718Z 2022-08-17T13:22:32.4422953Z Generating XML reports... 2022-08-17T13:22:32.4456142Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-TimeoutTest-20220817132230.xml 2022-08-17T13:22:33.0133587Z Running distributed/fsdp/test_fsdp_core ... 
[2022-08-17 13:22:33.012913] 2022-08-17T13:22:33.0134310Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_core.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:22:33.012993] 2022-08-17T13:22:34.5918577Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_core 2022-08-17T13:22:34.5943029Z 2022-08-17T13:22:34.5943285Z Running tests... 2022-08-17T13:22:34.5943966Z ---------------------------------------------------------------------- 2022-08-17T13:22:34.5951452Z test_pre_backward_hook_registration_after_state_dict (__main__.TestHooks) 2022-08-17T13:22:36.0730676Z Tests that FSDP pre-backward hooks are registered on forward pass ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:22:36.0907376Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57445 2022-08-17T13:22:36.0913867Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57446 2022-08-17T13:22:37.5011146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:37.5011648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:37.5013898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:37.5014372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:37.5304406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:37.5304875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:37.5309106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:37.5309567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:37.6685811Z dist init r=1, 
world=2 2022-08-17T13:22:37.6689786Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:22:37.7039812Z dist init r=0, world=2 2022-08-17T13:22:37.7044394Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:22:37.7045572Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:22:37.7098243Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:22:39.0710680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:22:39.0711215Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:22:39.1016758Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:22:39.1017544Z warnings.warn( 2022-08-17T13:22:39.1018934Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:22:39.1019712Z warnings.warn( 2022-08-17T13:22:39.1118553Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:22:39.1119120Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:22:39.1119798Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:22:39.1120340Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:22:40.4020499Z ok (5.807s) 2022-08-17T13:22:40.4025400Z test_pre_backward_hook_registration_cuda_first_False (__main__.TestHooks) 2022-08-17T13:22:40.4039590Z Tests that FSDP pre-backward hooks are registered on forward pass ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57528 2022-08-17T13:22:40.4045772Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57529 2022-08-17T13:22:41.8861640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:41.8862151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:41.8864553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:41.8865030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:41.8927355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:41.8927835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:41.8931737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:41.8932201Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:42.0528279Z dist init r=1, world=2 2022-08-17T13:22:42.0532008Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:22:42.0613802Z dist init r=0, world=2 2022-08-17T13:22:42.0618145Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:22:42.0619132Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:22:42.0635229Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:22:43.4482484Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:22:43.4778025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:22:43.4779398Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:22:43.4780186Z warnings.warn( 2022-08-17T13:22:43.4781302Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:22:43.4782044Z warnings.warn( 2022-08-17T13:22:43.4881164Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:22:43.4881757Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:22:43.4882458Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:22:43.4882977Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:22:44.7153031Z ok (4.313s) 2022-08-17T13:22:44.7156821Z test_pre_backward_hook_registration_cuda_first_True (__main__.TestHooks) 2022-08-17T13:22:44.7170193Z Tests that FSDP pre-backward hooks are registered on forward pass ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57611 2022-08-17T13:22:44.7176075Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57612 2022-08-17T13:22:46.1625475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:46.1625984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:46.1628490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:46.1628961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:46.1870105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:46.1870566Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:46.1874548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:46.1875028Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:46.3294641Z dist init r=1, world=2 2022-08-17T13:22:46.3298888Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:22:46.3594924Z dist init r=0, world=2 2022-08-17T13:22:46.3599447Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:22:46.3600257Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:22:46.3605204Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:22:47.7489237Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:22:47.7489793Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:22:47.7901643Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:22:47.7902220Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:22:47.7933877Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:22:47.7934432Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:22:49.0281134Z ok (4.313s) 2022-08-17T13:22:49.0289766Z test_register_functions_called_cuda_first_False_mixed_precision_False (__main__.TestHooks) 2022-08-17T13:22:49.0302935Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57694 2022-08-17T13:22:49.0309794Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57695 2022-08-17T13:22:50.4960432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:50.4960960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:50.4962841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:50.4963327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:50.5110882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:50.5111345Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:50.5114981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:50.5115635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:50.6697364Z dist init r=1, world=2 2022-08-17T13:22:50.6701059Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:22:50.6790791Z dist init r=0, world=2 2022-08-17T13:22:50.6795113Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:22:50.6796038Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:22:50.6804369Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:22:52.0520887Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:22:52.0521420Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:22:52.0819796Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:22:52.0820589Z warnings.warn( 2022-08-17T13:22:52.0857197Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:22:52.0857955Z warnings.warn( 2022-08-17T13:22:52.0923868Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:22:52.0924447Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:22:52.0965305Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:22:52.0965858Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:22:53.2412393Z ok (4.213s) 2022-08-17T13:22:53.2420613Z test_register_functions_called_cuda_first_False_mixed_precision_True (__main__.TestHooks) 2022-08-17T13:22:53.2433728Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57773 2022-08-17T13:22:53.2440127Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57774 2022-08-17T13:22:54.6812698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:54.6813225Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:54.6815419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:54.6815920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:54.7094856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:54.7095327Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:54.7099262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:54.7099733Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:54.8484730Z dist init r=1, world=2 2022-08-17T13:22:54.8488836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:22:54.8847766Z dist init r=0, world=2 2022-08-17T13:22:54.8852431Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:22:54.8853276Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:22:54.8897831Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:22:56.2547779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:22:56.2548661Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:22:56.2892538Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:727: UserWarning: Mixed precision was specified for FSDP module with batchnorm submodules wrapped via ``auto_wrap_policy``. BatchNorm units will be wrapped as a separate FSDP unit, with mixed_precision disabled (i.e. set to ``None``) as several BatchNorm kernels would raise errors when operating on reduced precision inputs. 2022-08-17T13:22:56.2894216Z warnings.warn( 2022-08-17T13:22:56.2896522Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:727: UserWarning: Mixed precision was specified for FSDP module with batchnorm submodules wrapped via ``auto_wrap_policy``. BatchNorm units will be wrapped as a separate FSDP unit, with mixed_precision disabled (i.e. set to ``None``) as several BatchNorm kernels would raise errors when operating on reduced precision inputs. 
2022-08-17T13:22:56.2898045Z warnings.warn( 2022-08-17T13:22:56.2904819Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:22:56.2906280Z warnings.warn( 2022-08-17T13:22:56.2908396Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:22:56.2909686Z warnings.warn( 2022-08-17T13:22:56.3019335Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:22:56.3020511Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:22:56.3022929Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:22:56.3024238Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:22:57.4543266Z ok (4.213s) 2022-08-17T13:22:57.4552328Z test_register_functions_called_cuda_first_True_mixed_precision_False (__main__.TestHooks) 2022-08-17T13:22:57.4565616Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57852 2022-08-17T13:22:57.4571603Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57853 2022-08-17T13:22:58.8998493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:58.8998972Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:58.9001306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:58.9001800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:58.9079872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:22:58.9080324Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:22:58.9084645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:22:58.9085098Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:22:59.0665717Z dist init r=1, world=2 2022-08-17T13:22:59.0669817Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:22:59.0764452Z dist init r=0, world=2 2022-08-17T13:22:59.0768965Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:22:59.0769982Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:22:59.0772942Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:23:00.4371280Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:23:00.4371786Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:23:00.4959434Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:00.4960588Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:00.4962006Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:00.4963110Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:01.6677336Z ok (4.213s) 2022-08-17T13:23:01.6685511Z test_register_functions_called_cuda_first_True_mixed_precision_True (__main__.TestHooks) 2022-08-17T13:23:01.6698919Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57931 2022-08-17T13:23:01.6705075Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57932 2022-08-17T13:23:03.1457530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:03.1458019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:03.1460501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:03.1461033Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:03.1600707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:03.1601188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:03.1604619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:03.1605171Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:03.3199628Z dist init r=0, world=2 2022-08-17T13:23:03.3203875Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:23:03.3261459Z dist init r=1, world=2 2022-08-17T13:23:03.3266298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:23:03.3267379Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:23:03.3307637Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:23:04.7181372Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:23:04.7181918Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:23:04.7516072Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:727: UserWarning: Mixed precision was specified for FSDP module with batchnorm submodules wrapped via ``auto_wrap_policy``. BatchNorm units will be wrapped as a separate FSDP unit, with mixed_precision disabled (i.e. set to ``None``) as several BatchNorm kernels would raise errors when operating on reduced precision inputs. 2022-08-17T13:23:04.7516860Z warnings.warn( 2022-08-17T13:23:04.7518322Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:727: UserWarning: Mixed precision was specified for FSDP module with batchnorm submodules wrapped via ``auto_wrap_policy``. BatchNorm units will be wrapped as a separate FSDP unit, with mixed_precision disabled (i.e. set to ``None``) as several BatchNorm kernels would raise errors when operating on reduced precision inputs. 
2022-08-17T13:23:04.7519115Z warnings.warn( 2022-08-17T13:23:04.7639300Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:04.7639865Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:04.7642612Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:04.7643153Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:05.9810114Z ok (4.313s) 2022-08-17T13:23:05.9819061Z test_transformer_no_grad_mixed_precision_False (__main__.TestNoGrad) 2022-08-17T13:23:05.9832826Z Tests that for an FSDP-wrapped transformer model with shared ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58010 2022-08-17T13:23:05.9838815Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58011 2022-08-17T13:23:07.4369566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:07.4370068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:07.4372355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:07.4372872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:07.4781618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:07.4782111Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:07.4786511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:07.4787014Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:07.6057145Z dist init r=1, world=2 2022-08-17T13:23:07.6060835Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:23:07.6504619Z dist init r=0, world=2 2022-08-17T13:23:07.6509996Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:23:07.6510999Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:23:07.6570956Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:23:09.0043590Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:23:09.0044134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:23:09.0380313Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:23:09.0381099Z warnings.warn( 2022-08-17T13:23:09.0382209Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:23:09.0382967Z warnings.warn( 2022-08-17T13:23:09.0483290Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:09.0483858Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:09.0485716Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:09.0486270Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:10.2944193Z ok (4.313s) 2022-08-17T13:23:10.2953012Z test_transformer_no_grad_mixed_precision_True (__main__.TestNoGrad) 2022-08-17T13:23:10.2966082Z Tests that for an FSDP-wrapped transformer model with shared ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58093 2022-08-17T13:23:10.2971803Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58094 2022-08-17T13:23:11.7610913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:11.7611859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:11.7613384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:11.7614305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:11.8032014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:11.8032934Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:11.8036472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:11.8037710Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:11.9281211Z dist init r=1, world=2 2022-08-17T13:23:11.9285673Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:23:11.9758614Z dist init r=0, world=2 2022-08-17T13:23:11.9763560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:23:11.9764528Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:23:11.9796284Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:23:13.3488058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:23:13.3489500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:23:13.3770059Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:727: UserWarning: Mixed precision was specified for FSDP module with batchnorm submodules wrapped via ``auto_wrap_policy``. BatchNorm units will be wrapped as a separate FSDP unit, with mixed_precision disabled (i.e. set to ``None``) as several BatchNorm kernels would raise errors when operating on reduced precision inputs. 2022-08-17T13:23:13.3771765Z warnings.warn( 2022-08-17T13:23:13.3774045Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:727: UserWarning: Mixed precision was specified for FSDP module with batchnorm submodules wrapped via ``auto_wrap_policy``. BatchNorm units will be wrapped as a separate FSDP unit, with mixed_precision disabled (i.e. set to ``None``) as several BatchNorm kernels would raise errors when operating on reduced precision inputs. 
2022-08-17T13:23:13.3775559Z warnings.warn( 2022-08-17T13:23:13.3785930Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:23:13.3787500Z warnings.warn( 2022-08-17T13:23:13.3789701Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:23:13.3791153Z warnings.warn( 2022-08-17T13:23:13.3910816Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:13.3911858Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:13.3912894Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:13.3913890Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:14.6075799Z ok (4.313s) 2022-08-17T13:23:14.6085498Z test_param_change_after_init_mixed_precision_False (__main__.TestParamInit) 2022-08-17T13:23:14.6100011Z Tests that changing FSDP model parameter values in-place after FSDP ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58176 2022-08-17T13:23:14.6106017Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58177 2022-08-17T13:23:16.0758066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:16.0758603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:16.0760979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:16.0761470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:16.0814557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:16.0815021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:16.0818986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:16.0819639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:16.2474329Z dist init r=1, world=2 2022-08-17T13:23:16.2477959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:23:16.2499500Z dist init r=0, world=2 2022-08-17T13:23:16.2504241Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:23:16.2505733Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:23:16.2581542Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:23:17.6296481Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:23:17.6297008Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:23:17.6581301Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:23:17.6582095Z warnings.warn( 2022-08-17T13:23:17.6586812Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:23:17.6587558Z warnings.warn( 2022-08-17T13:23:17.6681913Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:17.6682480Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:17.6689666Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:17.6690209Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:18.8211738Z ok (4.213s) 2022-08-17T13:23:18.8220154Z test_param_change_after_init_mixed_precision_True (__main__.TestParamInit) 2022-08-17T13:23:18.8233780Z Tests that changing FSDP model parameter values in-place after FSDP ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58255 2022-08-17T13:23:18.8239979Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58256 2022-08-17T13:23:20.2472621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:20.2473147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:20.2475607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:20.2476150Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:20.2710636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:20.2711099Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:20.2715121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:20.2715602Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:20.4140144Z dist init r=1, world=2 2022-08-17T13:23:20.4143903Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:23:20.4438956Z dist init r=0, world=2 2022-08-17T13:23:20.4444162Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:23:20.4444934Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:23:20.4450374Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:23:21.8342630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:23:21.8343168Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:23:21.8651156Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:727: UserWarning: Mixed precision was specified for FSDP module with batchnorm submodules wrapped via ``auto_wrap_policy``. BatchNorm units will be wrapped as a separate FSDP unit, with mixed_precision disabled (i.e. set to ``None``) as several BatchNorm kernels would raise errors when operating on reduced precision inputs. 2022-08-17T13:23:21.8651941Z warnings.warn( 2022-08-17T13:23:21.8664053Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:23:21.8664817Z warnings.warn( 2022-08-17T13:23:21.8688706Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:727: UserWarning: Mixed precision was specified for FSDP module with batchnorm submodules wrapped via ``auto_wrap_policy``. BatchNorm units will be wrapped as a separate FSDP unit, with mixed_precision disabled (i.e. set to ``None``) as several BatchNorm kernels would raise errors when operating on reduced precision inputs. 2022-08-17T13:23:21.8689467Z warnings.warn( 2022-08-17T13:23:21.8703112Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:23:21.8704060Z warnings.warn( 2022-08-17T13:23:21.8771518Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:21.8772078Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:21.8814806Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:21.8815384Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:23.0344942Z ok (4.213s) 2022-08-17T13:23:23.0350707Z test_delayed_optim_step_offload_false_no_shard (__main__.TestParityWithDDP) 2022-08-17T13:23:23.0363696Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58334 2022-08-17T13:23:23.0369586Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58335 2022-08-17T13:23:24.4902440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:24.4902996Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:24.4905958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:24.4906455Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:24.5073731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:24.5074200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:24.5078366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:24.5078842Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:24.6638645Z dist init r=1, world=2 2022-08-17T13:23:24.6642922Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:23:24.6755599Z dist init r=0, world=2 2022-08-17T13:23:24.6760158Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:23:24.6761285Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:23:24.6848403Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:23:26.0420892Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:23:26.0421427Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:23:26.0707646Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:26.0708235Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:26.0708930Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:26.0709504Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:26.7404487Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:26.7405036Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:26.7436461Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:23:26.7437242Z warnings.warn( 2022-08-17T13:23:26.7438631Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:23:26.7439393Z warnings.warn( 2022-08-17T13:23:27.3017421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:23:27.3018141Z warnings.warn(msg, FutureWarning) 2022-08-17T13:23:27.3021303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:23:27.3022284Z warnings.warn(msg, FutureWarning) 2022-08-17T13:23:27.4481894Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:27.4482403Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:28.0432350Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:28.0432874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:28.6393982Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:28.6394509Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:29.2345334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:29.2345853Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:23:29.8304971Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:29.8305490Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:30.4260478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:30.4260987Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:31.0247239Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:31.0247755Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:31.6228008Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:31.6228542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:32.2212347Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:32.2212868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:32.8191564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:32.8192099Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:33.1160304Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1161933Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1164016Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1166076Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1167356Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1168774Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1170487Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:23:33.1171760Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1173022Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1174310Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1175576Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1176828Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1178071Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1179385Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1180643Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1181913Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1183197Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1184906Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:23:33.1186164Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1187429Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1188681Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1189928Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1191180Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1192424Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1193671Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1195008Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1196248Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1197491Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1198810Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:23:33.1200054Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1201292Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1202542Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1203779Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1205020Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1206268Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1207562Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1209310Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1210647Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1211903Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1213394Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:23:33.1214721Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1215972Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1217224Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1218473Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1219717Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1220951Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1222198Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1223695Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1225035Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1226411Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1227777Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:23:33.1229208Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1230570Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.1231924Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.4195041Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:33.4195566Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:33.7563221Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.7564552Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:23:33.7565823Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.7595065Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.7596331Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:33.7597876Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:34.3614721Z ok (11.327s) 2022-08-17T13:23:34.3620556Z test_delayed_optim_step_offload_false_none (__main__.TestParityWithDDP) 2022-08-17T13:23:34.3634836Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58417 2022-08-17T13:23:34.3641205Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58418 2022-08-17T13:23:35.8042370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:35.8042877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:35.8045306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:35.8045807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:35.8246912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:35.8247373Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:35.8251431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:35.8251913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:35.9763603Z dist init r=0, world=2 2022-08-17T13:23:35.9767418Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:23:36.0002844Z dist init r=1, world=2 2022-08-17T13:23:36.0007683Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:23:36.0008871Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:23:36.0074319Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:23:37.3640129Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:23:37.3640676Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:23:37.3906937Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:37.3907507Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:37.3908220Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:37.3908790Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:38.0511233Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:38.0511762Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:38.0544090Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:23:38.0545003Z warnings.warn( 2022-08-17T13:23:38.0546372Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:23:38.0547161Z warnings.warn( 2022-08-17T13:23:38.8293951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:23:38.8294659Z warnings.warn(msg, FutureWarning) 2022-08-17T13:23:38.8295581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:23:38.8296446Z warnings.warn(msg, FutureWarning) 2022-08-17T13:23:39.0813579Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:39.0814102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:40.1101476Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:40.1102050Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:41.1394652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:41.1395181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:42.1686671Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:42.1687223Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:23:43.1980314Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:43.1980846Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:44.2276863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:44.2277419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:45.2590859Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:45.2591411Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:46.2906330Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:46.2906878Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:47.3225815Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:47.3226386Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:48.3546216Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:48.3546761Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:48.8641911Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:305: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 
2022-08-17T13:23:48.8642794Z subtensor.view(shape) for (subtensor, shape) in 2022-08-17T13:23:48.8644142Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:305: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:23:48.8644971Z subtensor.view(shape) for (subtensor, shape) in 2022-08-17T13:23:49.3884745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:49.3885287Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:49.9386506Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:49.9388358Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:49.9389738Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:23:49.9391107Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:49.9397779Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:49.9399063Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:49.9400329Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:23:49.9401588Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:23:50.5978645Z ok (16.236s) 2022-08-17T13:23:50.5983578Z test_delayed_optim_step_offload_false_shard_grad_op (__main__.TestParityWithDDP) 2022-08-17T13:23:50.5997465Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58500 2022-08-17T13:23:50.6003745Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58501 2022-08-17T13:23:51.9913177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:51.9913686Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:51.9916387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:51.9916892Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:52.0574180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:23:52.0574658Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:23:52.0578430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:23:52.0578914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:23:52.1583805Z dist init r=1, world=2 2022-08-17T13:23:52.1588059Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:23:52.2318861Z dist init r=0, world=2 2022-08-17T13:23:52.2323205Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:23:52.2324127Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:23:52.2403953Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:23:53.5929745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:23:53.5930260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:23:53.6185845Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:53.6186430Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:53.6187137Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:23:53.6187684Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:23:54.2868478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:54.2869032Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:54.2901798Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:23:54.2902581Z warnings.warn( 2022-08-17T13:23:54.2903914Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:23:54.2904676Z warnings.warn( 2022-08-17T13:23:55.0646562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:23:55.0647314Z warnings.warn(msg, FutureWarning) 2022-08-17T13:23:55.0648445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:23:55.0649181Z warnings.warn(msg, FutureWarning) 2022-08-17T13:23:55.3166212Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:55.3166758Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:56.3447736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:56.3448255Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:57.3727934Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:57.3729128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:58.4018664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:58.4019209Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:23:59.4302778Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:23:59.4303298Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:00.4588245Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:00.4588776Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:01.4899684Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:01.4900210Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:02.5216693Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:02.5217232Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:03.5525340Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:03.5525842Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:04.5830587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:04.5831111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:05.0918298Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:305: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 
2022-08-17T13:24:05.0919191Z subtensor.view(shape) for (subtensor, shape) in 2022-08-17T13:24:05.0920406Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:305: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:24:05.0921209Z subtensor.view(shape) for (subtensor, shape) in 2022-08-17T13:24:05.6157511Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:05.6158036Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:06.1649637Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:06.1651411Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:06.1653195Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:06.1654610Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:06.1655880Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:06.1657115Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:06.1658371Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:06.1659615Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:06.8351045Z ok (16.237s) 2022-08-17T13:24:06.8355900Z test_delayed_optim_step_offload_true_no_shard (__main__.TestParityWithDDP) 2022-08-17T13:24:06.8360234Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82490 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-08-17T13:24:06.8364624Z test_delayed_optim_step_offload_true_none (__main__.TestParityWithDDP) 2022-08-17T13:24:06.8377203Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58583 2022-08-17T13:24:06.8382964Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58584 2022-08-17T13:24:08.3099355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:24:08.3099852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:24:08.3101970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:24:08.3102656Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:24:08.3255920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:24:08.3256532Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:24:08.3259762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:24:08.3260380Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:24:08.4838152Z dist init r=1, world=2 
2022-08-17T13:24:08.4841611Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:24:08.4926774Z dist init r=0, world=2 2022-08-17T13:24:08.4931356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:24:08.4932584Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:24:08.4945022Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:24:09.8542340Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:24:09.8542849Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:24:09.8827248Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:24:09.8827906Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:24:09.8828622Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:24:09.8829167Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:24:10.5497932Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:10.5498478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:10.5530446Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:24:10.5531235Z warnings.warn( 2022-08-17T13:24:10.5532365Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:24:10.5533116Z warnings.warn( 2022-08-17T13:24:11.0528637Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:11.0529178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:11.5549802Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:11.5550550Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:11.5575600Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:11.5577146Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:11.5578441Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:11.5579709Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:11.5581071Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:11.5582313Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:11.5583970Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:11.5585261Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:12.0577426Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:12.0577994Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:12.3078141Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:12.3079549Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:12.5601492Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:12.5601982Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:13.0632095Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:13.0663477Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:13.0664899Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:13.0666232Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:13.0667502Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:13.0668878Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:13.0670143Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:13.0671404Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:13.0672661Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:13.0673914Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:13.5653932Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:13.5654439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:13.8187177Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:13.8188987Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:14.3667039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 
2022-08-17T13:24:14.3667714Z warnings.warn(msg, FutureWarning) 2022-08-17T13:24:14.3669375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:24:14.3670048Z warnings.warn(msg, FutureWarning) 2022-08-17T13:24:14.6187565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:14.6188067Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:15.6698294Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:15.6698839Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:16.7207147Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:16.7207643Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:17.7761796Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:17.7762323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:18.8283567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:18.8284115Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:19.8800827Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:19.8801363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:24:20.3881463Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:24:20.3882284Z return iter(self.unbind(0)) 2022-08-17T13:24:20.3883432Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:24:20.3884194Z return iter(self.unbind(0)) 2022-08-17T13:24:20.9272874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:20.9273380Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:22.0171334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:22.0171872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:23.0633835Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:23.0634365Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:24.1094270Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:24.1094812Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:24:25.1555908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:25.1556449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:26.3788646Z ok (19.543s) 2022-08-17T13:24:26.3795356Z test_delayed_optim_step_offload_true_shard_grad_op (__main__.TestParityWithDDP) 2022-08-17T13:24:26.3810035Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58666 2022-08-17T13:24:26.3816057Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58667 2022-08-17T13:24:27.8170196Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:24:27.8170703Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:24:27.8172584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:24:27.8173072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:24:27.8349385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:24:27.8350101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:24:27.8353718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:24:27.8354205Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:24:27.9874820Z dist init r=1, world=2 2022-08-17T13:24:27.9878881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:24:28.0082358Z dist init r=0, world=2 2022-08-17T13:24:28.0086793Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 
0 2022-08-17T13:24:28.0087747Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:24:28.0185696Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:24:29.3781106Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:24:29.3781646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:24:29.4030708Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:24:29.4031282Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:24:29.4031985Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:24:29.4032505Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:24:30.0595991Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:30.0596527Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:30.0628437Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:24:30.0629232Z warnings.warn( 2022-08-17T13:24:30.0630353Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:24:30.0631086Z warnings.warn( 2022-08-17T13:24:30.5646213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:30.5646744Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:31.0672107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:31.0672681Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:31.0697067Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:31.0698378Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:31.0699813Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:31.0701078Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:31.0702325Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:31.0703939Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:31.0705230Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:31.0706490Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:31.5696960Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:24:31.5697529Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:31.8198952Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:31.8200281Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:32.0720994Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:32.0721695Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:32.5747739Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:32.5748251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:32.5782649Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:32.5784424Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:32.5786003Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:32.5787264Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:32.5788525Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:32.5789782Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:32.5791042Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:32.5792295Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:33.0771132Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:33.0771641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:33.3306004Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:33.3307344Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:33.8770562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:24:33.8771302Z warnings.warn(msg, FutureWarning) 2022-08-17T13:24:33.8772238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 
2022-08-17T13:24:33.8772898Z warnings.warn(msg, FutureWarning) 2022-08-17T13:24:34.1291357Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:34.1291876Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:35.1792816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:35.1793355Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:36.2285914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:36.2286464Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:37.2782555Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:37.2783068Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:38.3294133Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:38.3294700Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:39.3797339Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:39.3797872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:39.8873341Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 
2022-08-17T13:24:39.8874177Z return iter(self.unbind(0)) 2022-08-17T13:24:39.8875315Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:24:39.8876122Z return iter(self.unbind(0)) 2022-08-17T13:24:40.4253355Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:40.4253874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:41.5139363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:41.5139884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:42.5572292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:42.5572863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:43.6002783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:43.6003611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:44.6437262Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:44.6437801Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:45.9214659Z ok (19.543s) 2022-08-17T13:24:45.9219460Z test_delayed_reduce_scatter_offload_false_no_shard (__main__.TestParityWithDDP) 2022-08-17T13:24:45.9233002Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58749 2022-08-17T13:24:45.9238854Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58750 2022-08-17T13:24:47.3017098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:24:47.3017607Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:24:47.3019717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:24:47.3020207Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:24:47.3410832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:24:47.3411299Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:24:47.3415390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:24:47.3415869Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:24:47.4683216Z dist init r=1, world=2 2022-08-17T13:24:47.4687439Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:24:47.5165921Z dist init r=0, world=2 2022-08-17T13:24:47.5170820Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:24:47.5171915Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:24:47.5197723Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:24:48.8708469Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:24:48.8708998Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:24:48.8953796Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:24:48.8954406Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:24:48.8955113Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:24:48.8955656Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:24:49.3076230Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.3076808Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.3108256Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:24:49.3109066Z warnings.warn( 2022-08-17T13:24:49.3110450Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:24:49.3111202Z warnings.warn( 2022-08-17T13:24:49.3419415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:24:49.3420107Z warnings.warn(msg, FutureWarning) 2022-08-17T13:24:49.3432940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:24:49.3433637Z warnings.warn(msg, FutureWarning) 2022-08-17T13:24:49.3485683Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.3486441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.3867856Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.3868594Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.4247544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.4248336Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.4626668Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.4627428Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:24:49.5007425Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.5008102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.5388098Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.5388869Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.5768012Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.5768718Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.6151241Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.6151977Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.6532941Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.6533692Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.6913518Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.6914262Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.7080741Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7082430Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7084023Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7085617Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7087123Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7088641Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7090545Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:49.7091906Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7093449Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7094712Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7095976Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7097225Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7098491Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7099814Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7101074Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7102329Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7103923Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7105168Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:49.7106418Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7107684Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7108936Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7110186Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7111441Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7112706Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7114037Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7115298Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7116553Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7117882Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7119115Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:49.7120359Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7121624Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7122880Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7124130Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7125385Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7126636Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7127887Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7129263Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7130493Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7131742Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7133044Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:49.7134291Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7135539Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7136790Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7138039Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7139284Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7140537Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7141763Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7143008Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7144710Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7145983Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7147231Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:49.7148550Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7149802Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7151046Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7152307Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7153538Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7154793Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7156041Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7157295Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7158620Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7326060Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.7326834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:49.7914202Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7915672Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7917339Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7922657Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7923917Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:49.7925193Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:50.2349188Z ok (4.313s) 2022-08-17T13:24:50.2353836Z test_delayed_reduce_scatter_offload_false_none (__main__.TestParityWithDDP) 2022-08-17T13:24:50.2358508Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82704 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-08-17T13:24:50.2363086Z test_delayed_reduce_scatter_offload_false_shard_grad_op (__main__.TestParityWithDDP) 2022-08-17T13:24:50.2366656Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82398 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-08-17T13:24:50.2371211Z test_delayed_reduce_scatter_offload_true_no_shard (__main__.TestParityWithDDP) 2022-08-17T13:24:50.2383924Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58832 2022-08-17T13:24:50.2390504Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58833 2022-08-17T13:24:51.6570939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:24:51.6571657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:24:51.6573649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:24:51.6574115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:24:51.6930922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:24:51.6931382Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:24:51.6935207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:24:51.6935670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:24:51.8232277Z dist init r=0, world=2 2022-08-17T13:24:51.8236304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:24:51.8646704Z dist init r=1, world=2 2022-08-17T13:24:51.8651172Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:24:51.8651969Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:24:51.8746614Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:24:53.2327371Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:24:53.2327940Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:24:53.2553270Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:24:53.2553878Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:24:53.2554573Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:24:53.2555125Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:24:53.6649456Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.6657622Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.6682120Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:24:53.6682907Z warnings.warn( 2022-08-17T13:24:53.6691007Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:24:53.6691773Z warnings.warn( 2022-08-17T13:24:53.6785677Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.6786834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.6898275Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.6899218Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.6925296Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.6926624Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.6927890Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.6929260Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.6930523Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.6931762Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.6933029Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.6934283Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7013639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.7014146Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.7069746Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7071571Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7127648Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.7128480Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.7240399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.7241441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.7271076Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7272553Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7273940Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7275206Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7276512Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7277783Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7279048Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7280295Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7354925Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:24:53.7355430Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.7420311Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7422952Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:53.7836873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:24:53.7837587Z warnings.warn(msg, FutureWarning) 2022-08-17T13:24:53.7842394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:24:53.7843067Z warnings.warn(msg, FutureWarning) 2022-08-17T13:24:53.7893754Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.7894344Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.8408908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:24:53.8409521Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.8919877Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.8920635Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.9429043Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.9429548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.9942568Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:53.9943884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.0454486Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.0454966Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.0608690Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:24:54.0609486Z return iter(self.unbind(0)) 2022-08-17T13:24:54.0610608Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:24:54.0611381Z return iter(self.unbind(0)) 2022-08-17T13:24:54.0975479Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.0976164Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.1923895Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.1924679Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.2429540Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.2430581Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.2944287Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.2945206Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.3453368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.3453888Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:24:54.8503190Z ok (4.613s) 2022-08-17T13:24:54.8508211Z test_delayed_reduce_scatter_offload_true_none (__main__.TestParityWithDDP) 2022-08-17T13:24:54.8512429Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82399 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(0.001s) 2022-08-17T13:24:54.8516530Z test_delayed_reduce_scatter_offload_true_shard_grad_op (__main__.TestParityWithDDP) 2022-08-17T13:24:54.8520053Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82403 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-08-17T13:24:54.8537361Z test_mixture_of_experts_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58915 2022-08-17T13:24:54.8543233Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58916 2022-08-17T13:24:56.2791216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:24:56.2791709Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:24:56.2793914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:24:56.2794441Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:24:56.3168564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:24:56.3169036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:24:56.3172754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:24:56.3173232Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:24:56.4465994Z dist init r=1, world=2 2022-08-17T13:24:56.4470338Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-08-17T13:24:56.4919822Z dist init r=0, world=2 2022-08-17T13:24:56.4924207Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:24:56.4925434Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:24:56.4980562Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:24:57.8741763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:24:57.8742271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:24:58.3092503Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:24:58.3093321Z warnings.warn( 2022-08-17T13:24:58.3121279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:24:58.3125768Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:24:58.3126553Z warnings.warn( 2022-08-17T13:24:58.3156543Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:24:58.3157232Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:24:58.3176354Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.3177834Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.3179116Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.3193980Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:24:58.3194548Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:24:58.3224476Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:24:58.3243711Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.3244992Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.3246255Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.3260903Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:24:58.3261467Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:24:58.3773937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:24:58.3774699Z warnings.warn(msg, FutureWarning) 2022-08-17T13:24:58.3775876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 
2022-08-17T13:24:58.3776644Z warnings.warn(msg, FutureWarning) 2022-08-17T13:24:58.3867947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-08-17T13:24:58.3868733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-08-17T13:24:58.3869434Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:24:58.3870119Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:24:58.4495222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-08-17T13:24:58.4495774Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-08-17T13:24:58.4496548Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:24:58.4497221Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:24:58.5127537Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-08-17T13:24:58.5128263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-08-17T13:24:58.5129049Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:24:58.5129723Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 
2022-08-17T13:24:58.6127513Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-08-17T13:24:58.6128248Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-08-17T13:24:58.6129017Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:24:58.6129682Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:24:58.6759713Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-08-17T13:24:58.6760442Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-08-17T13:24:58.6761234Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:24:58.6761928Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:24:58.6780978Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.6782306Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.6783958Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.6785454Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.6786866Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.6788238Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.7402095Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-08-17T13:24:58.7402774Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-08-17T13:24:58.7403536Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:24:58.7404227Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:24:58.7452557Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.7453906Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.7455157Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.7456423Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.7457696Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:58.7458954Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:58.8041351Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-08-17T13:24:58.8041963Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-08-17T13:24:58.8042906Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:24:58.8043635Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:24:58.8096221Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:305: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:24:58.8097100Z subtensor.view(shape) for (subtensor, shape) in 2022-08-17T13:24:58.8098315Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:305: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 
2022-08-17T13:24:58.8099271Z subtensor.view(shape) for (subtensor, shape) in 2022-08-17T13:24:58.8687786Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-08-17T13:24:58.8688451Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-08-17T13:24:58.8689225Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:24:58.8689928Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:24:58.8743008Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:775: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:24:58.8744258Z return super(Tensor, self).split_with_sizes(split_size, dim) 2022-08-17T13:24:58.8746264Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:775: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 
2022-08-17T13:24:58.8747876Z return super(Tensor, self).split_with_sizes(split_size, dim) 2022-08-17T13:24:58.9335580Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-08-17T13:24:58.9336294Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-08-17T13:24:58.9337080Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:24:58.9337776Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:24:58.9978080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-08-17T13:24:58.9978842Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-08-17T13:24:58.9979633Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:24:58.9980360Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:24:59.0043659Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0045565Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:59.0047795Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0049460Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0050738Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0052010Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0053275Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0055370Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0057781Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0060229Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0062707Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0066177Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0068670Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:59.0071091Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0072580Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0073851Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0075117Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0076371Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0077724Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0078989Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0080253Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0081506Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0082759Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0084064Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:59.0085323Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0086575Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0087915Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0089176Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0090432Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0091692Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0092939Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0094190Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0095420Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0096666Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0097963Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:59.0099239Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0100492Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0101739Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0103048Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0105232Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0106499Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0107735Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0108987Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0110242Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0111492Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0112744Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:59.0114079Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0115339Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0116951Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0118947Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0120212Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0121447Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0122708Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0123964Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0125220Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0126470Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0127718Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:59.0128963Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0130275Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0131538Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0132764Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0134074Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0135317Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0136566Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0137816Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0139061Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0140305Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0141552Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:24:59.0142795Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0144906Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0146175Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0147421Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0148738Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0149990Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:24:59.0656772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-08-17T13:24:59.0658073Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:24:59.0658605Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-08-17T13:24:59.0659303Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:24:59.7662703Z ok (4.914s) 2022-08-17T13:24:59.7682963Z test_mixture_of_experts_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59118 2022-08-17T13:24:59.7688701Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59119 2022-08-17T13:25:01.2043123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:01.2043631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:01.2045739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:01.2046227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:01.2357824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:01.2358297Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:01.2362093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 
238 disabled tests 2022-08-17T13:25:01.2362573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:01.3709743Z dist init r=1, world=2 2022-08-17T13:25:01.3713793Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:25:01.4100108Z dist init r=0, world=2 2022-08-17T13:25:01.4105006Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:25:01.4106314Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:01.4122040Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:02.8041363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:25:02.8041964Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:25:03.2472161Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:03.2472989Z warnings.warn( 2022-08-17T13:25:03.2499528Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:25:03.2577487Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:03.2578259Z warnings.warn( 2022-08-17T13:25:03.2608989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:25:03.2609679Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:03.2630290Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.2631575Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.2632857Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:03.2649939Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:03.2650520Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:03.2704787Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:03.2724216Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.2725519Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.2726933Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:03.2742999Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:03.2743923Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:03.3020040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:03.3020726Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:03.3024263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:03.3024948Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:03.3112224Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-08-17T13:25:03.3118206Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-08-17T13:25:03.3119022Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:03.3215691Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-08-17T13:25:03.3601351Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-08-17T13:25:03.3606857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-08-17T13:25:03.3607552Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:03.3704535Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:03.4088999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-08-17T13:25:03.4094290Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-08-17T13:25:03.4095037Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:03.4192100Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:03.4577202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-08-17T13:25:03.4582915Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-08-17T13:25:03.4583619Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:03.4680406Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 
2022-08-17T13:25:03.5068793Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-08-17T13:25:03.5074464Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-08-17T13:25:03.5075166Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:03.5097677Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.5099119Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.5100484Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.5171459Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:03.5193390Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:03.5194679Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.5195945Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.5566553Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-08-17T13:25:03.5573526Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-08-17T13:25:03.5574437Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:03.5669524Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:03.5720726Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.5722036Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:03.5723311Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.5724580Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.5725967Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.5727247Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.6061970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-08-17T13:25:03.6067867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-08-17T13:25:03.6068689Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:25:03.6164634Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-08-17T13:25:03.6216499Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.6217788Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.6219061Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.6220319Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.6221582Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.6222837Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.6562694Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-08-17T13:25:03.6569253Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-08-17T13:25:03.6570470Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:03.6665697Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:03.6718340Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.6719820Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.6721196Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.6722668Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.6724038Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.6725401Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7060523Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-08-17T13:25:03.7066819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-08-17T13:25:03.7067862Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:03.7163648Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:03.7215997Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7217412Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7218685Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7219942Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7221384Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7222663Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:03.7559279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-08-17T13:25:03.7565648Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-08-17T13:25:03.7566362Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:03.7662324Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:03.7724583Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7726885Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7729242Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7731439Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:03.7733794Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7735324Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7736584Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7737845Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7739229Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7740508Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7741762Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7743084Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7745046Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7746292Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7747554Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:03.7748779Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7750024Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7751275Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7752525Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7753774Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7755105Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7756361Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7757601Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7758954Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7760202Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7761426Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:03.7762675Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7763916Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7765165Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7766421Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7767657Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7768902Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7770207Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7771459Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7772703Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7773985Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7775227Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:03.7776482Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7777779Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7779024Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7780276Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7781521Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7782779Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7784523Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7785792Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7787016Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7788332Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7789573Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:03.7790825Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7792077Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7793320Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7794569Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7795823Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7797071Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7798316Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7799625Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7800888Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7802135Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7803441Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:03.7804691Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7805933Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7807188Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7808434Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7809678Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7810909Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7812157Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7813394Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7814692Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7815956Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7817203Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:03.7818494Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7819736Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7820993Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.7822219Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.8080962Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-08-17T13:25:03.8087833Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-08-17T13:25:03.8088528Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:03.8183999Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 
2022-08-17T13:25:03.8660671Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.8662020Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.8663725Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.8665297Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.8666582Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.8688294Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.8689734Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.8690990Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.8692250Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:03.8693505Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:04.3798349Z ok (4.613s) 2022-08-17T13:25:04.3816973Z test_mixture_of_experts_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59225 2022-08-17T13:25:04.3823188Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59226 2022-08-17T13:25:05.8611819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:05.8612300Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:05.8614954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:05.8615443Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:05.8974810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:05.8975263Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:05.8979153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:05.8979634Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:06.0277923Z dist init r=1, world=2 2022-08-17T13:25:06.0282109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:25:06.0701633Z dist init r=0, world=2 2022-08-17T13:25:06.0707108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:25:06.0707898Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:06.0792699Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:25:07.4610560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:25:07.4611233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:25:07.8848016Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:07.8849231Z warnings.warn( 2022-08-17T13:25:07.8875315Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:25:07.8924602Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:07.8925380Z warnings.warn( 2022-08-17T13:25:07.8953819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:25:07.8954626Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:07.8973664Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:07.8974941Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:07.8976222Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:07.8977940Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:07.8993446Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:07.8994009Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:07.8998512Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:07.8999986Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:07.9001404Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:07.9017562Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:07.9018116Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:07.9281428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:07.9282104Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:07.9283040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:07.9283699Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:07.9369645Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-08-17T13:25:07.9370462Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-08-17T13:25:07.9371187Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-08-17T13:25:07.9372084Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:07.9742228Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-08-17T13:25:07.9742737Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-08-17T13:25:07.9743812Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:07.9744521Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:08.0112548Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-08-17T13:25:08.0113047Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-08-17T13:25:08.0113716Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:08.0114539Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:08.0481083Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-08-17T13:25:08.0481592Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-08-17T13:25:08.0482257Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:08.0482929Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 
2022-08-17T13:25:08.0853096Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-08-17T13:25:08.0853768Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-08-17T13:25:08.0854467Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:08.0855125Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:08.0875761Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.0877093Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.0878480Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.0879744Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:08.0881009Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.0882266Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.1233442Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-08-17T13:25:08.1234076Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-08-17T13:25:08.1234747Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:08.1235644Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:08.1287840Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.1289124Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:08.1290392Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.1291741Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.1293015Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.1294269Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.1614586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-08-17T13:25:08.1616761Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-08-17T13:25:08.1617818Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:25:08.1717425Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-08-17T13:25:08.1768439Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.1769862Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.1771141Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.1772398Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.1773666Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.1774921Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.2097891Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-08-17T13:25:08.2098397Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-08-17T13:25:08.2099258Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:08.2099976Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:08.2151089Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.2152362Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.2153770Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.2155031Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.2156271Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.2157522Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.2485399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-08-17T13:25:08.2576461Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-08-17T13:25:08.2577242Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:08.2587157Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:08.2642168Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.2643450Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.2644730Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.2646163Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.2647437Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.2648679Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:08.2974858Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-08-17T13:25:08.2975377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-08-17T13:25:08.2976118Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:08.2976805Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:08.3040314Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3041621Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3042878Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3044131Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:08.3045377Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3046635Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3047897Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3049294Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3050937Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3052453Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3053824Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3055067Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3056320Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3057569Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3058797Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:08.3060070Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3061330Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3062562Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3064181Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3065528Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3066793Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3068043Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3069362Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3070610Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3071857Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3073090Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:08.3074333Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3075587Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3076841Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3078134Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3079434Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3080692Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3081937Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3083234Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3084462Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3085706Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3086955Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:08.3088210Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3089463Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3090716Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3091958Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3093206Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3094511Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3095751Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3096998Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3098316Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3099569Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3100815Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:08.3102060Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3103453Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3104707Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3105968Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3107198Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3108444Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3109761Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3111027Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3112274Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3113579Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3114819Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:08.3116068Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3117323Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3118560Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3119786Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3121040Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3122290Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3123600Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3124858Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3126099Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3127394Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3128639Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:08.3129883Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3131111Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3132364Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3133602Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3383487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-08-17T13:25:08.3384405Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-08-17T13:25:08.3386563Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:08.3485892Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 
2022-08-17T13:25:08.3966112Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3967658Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3968936Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3970192Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3971527Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3972791Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3974053Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3975317Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3976544Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.3977852Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:08.8931680Z ok (4.513s) 2022-08-17T13:25:08.8950731Z test_mixture_of_experts_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59332 2022-08-17T13:25:08.8956757Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59333 2022-08-17T13:25:10.3738631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:10.3739536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:10.3740758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:10.3741617Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:10.3853374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:10.3854357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:10.3856696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:10.3857636Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:10.5496568Z dist init r=0, world=2 2022-08-17T13:25:10.5500304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:25:10.5523834Z dist init r=1, world=2 2022-08-17T13:25:10.5528193Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:25:10.5529520Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:10.5603969Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:25:11.9369266Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:25:11.9369815Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:25:12.3604758Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:12.3605679Z warnings.warn( 2022-08-17T13:25:12.3631702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:25:12.3746257Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:12.3747026Z warnings.warn( 2022-08-17T13:25:12.3776955Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:25:12.3777669Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:12.3797375Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.3798688Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.3799943Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.3836602Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:12.3853946Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.3855425Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.3856835Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.3955230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-08-17T13:25:12.3961630Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-08-17T13:25:12.3962653Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:12.4058087Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:12.4104101Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4105428Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4106713Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4107976Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.4109243Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4110493Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4111746Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4113000Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4114459Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4115860Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4117229Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4118687Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4174252Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-08-17T13:25:12.4180970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-08-17T13:25:12.4181933Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:12.4277422Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:12.4374007Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4375325Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4376826Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4378327Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4379601Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4380851Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4382301Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.4383819Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4385091Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4386446Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4387697Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4388945Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.4393028Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-08-17T13:25:12.4399941Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-08-17T13:25:12.4400624Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:12.4495959Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:12.4608161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-08-17T13:25:12.4615353Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-08-17T13:25:12.4616039Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:12.4618747Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4620027Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4711476Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:12.4713368Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4714669Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4826528Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-08-17T13:25:12.4833357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-08-17T13:25:12.4834359Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:12.4848956Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.4929517Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:12.4943110Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.5054158Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-08-17T13:25:12.5061169Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-08-17T13:25:12.5061892Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:12.5090726Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.5092034Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.5093311Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.5094585Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.5106537Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:12.5107106Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:12.5157238Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:12.5183917Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.5185454Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.5186726Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.5188112Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.5198758Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:12.5199320Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:12.6057712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:12.6058859Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:12.6060600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:12.6061264Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:12.6158043Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-08-17T13:25:12.6165385Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-08-17T13:25:12.6166086Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:25:12.6260902Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-08-17T13:25:12.7036167Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-08-17T13:25:12.7040822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-08-17T13:25:12.7041932Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:12.7137135Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:12.7912953Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-08-17T13:25:12.7917587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-08-17T13:25:12.7918806Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:12.8014817Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:12.8922580Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-08-17T13:25:12.8927390Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-08-17T13:25:12.8928166Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:12.9023813Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-08-17T13:25:12.9807562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-08-17T13:25:12.9811954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-08-17T13:25:12.9813004Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:12.9849365Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9851279Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9853151Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9854939Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9856827Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9858112Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9859380Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9860639Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9862029Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9863531Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.9864810Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9866165Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9867427Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9868687Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9869945Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9871198Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9872440Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9873721Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9874973Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9876229Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9877570Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.9878840Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9880089Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9881398Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9882644Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9883895Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9885144Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9886389Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9887619Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9888855Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9890098Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9891340Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.9892640Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9893903Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9895150Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9896450Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9897692Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9898944Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9900182Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9901428Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9902683Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9909035Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:12.9942356Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9944614Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9946019Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9947808Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9949128Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9951097Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9952345Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.9953584Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9954852Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9956104Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9957350Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9958629Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9959881Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9961222Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9962487Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9963736Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9965020Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9966271Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.9967511Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9968772Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9970024Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9971272Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9972525Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9973771Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9975015Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9976325Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9977617Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9978863Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9980168Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.9981424Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9982670Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9984074Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9985318Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9986561Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9987789Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9989035Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9990278Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9991610Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9992880Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:12.9994126Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:12.9995429Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:13.0700853Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-08-17T13:25:13.0705590Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-08-17T13:25:13.0706339Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-08-17T13:25:13.0802301Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-08-17T13:25:13.1265963Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:13.1267289Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:13.1268564Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:13.1269834Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:13.1271092Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:13.1272327Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:13.1288936Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:13.1290238Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:13.1291489Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:13.1292900Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:13.1294155Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:13.1295400Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:13.2001246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-08-17T13:25:13.2004899Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-08-17T13:25:13.2005650Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-08-17T13:25:13.2102417Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 
2022-08-17T13:25:13.2878850Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-08-17T13:25:13.2884564Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-08-17T13:25:13.2885334Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-08-17T13:25:13.2980095Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-08-17T13:25:13.3758261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-08-17T13:25:13.3762723Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-08-17T13:25:13.3763467Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-08-17T13:25:13.3859526Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-08-17T13:25:13.4622059Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-08-17T13:25:13.4627033Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-08-17T13:25:13.4627970Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-08-17T13:25:13.4723366Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 
2022-08-17T13:25:13.5488044Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-08-17T13:25:13.5492289Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-08-17T13:25:13.5493026Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-08-17T13:25:13.5589587Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-08-17T13:25:14.2087356Z ok (5.315s) 2022-08-17T13:25:14.2106975Z test_mixture_of_experts_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59547 2022-08-17T13:25:14.2112895Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59548 2022-08-17T13:25:15.6533994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:15.6534496Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:15.6537110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:15.6537578Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:15.7066534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:15.7067021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:15.7071146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:15.7071614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:15.8214176Z dist init r=0, world=2 2022-08-17T13:25:15.8218030Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:25:15.8844598Z dist init r=1, world=2 2022-08-17T13:25:15.8849600Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:25:15.8850819Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:15.8931587Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:17.2384511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:25:17.2385051Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:25:17.6757275Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:17.6758099Z warnings.warn( 2022-08-17T13:25:17.6766285Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:25:17.6767047Z warnings.warn( 2022-08-17T13:25:17.6789964Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:25:17.6796146Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:25:17.6797086Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:17.6816170Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.6817473Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.6818919Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.6893333Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:17.6913554Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:17.6914830Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.6916103Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7017345Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-08-17T13:25:17.7021435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-08-17T13:25:17.7022610Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:17.7120231Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:17.7169216Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7170547Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:17.7172007Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7173292Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7174544Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7175894Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7177142Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7178428Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7179689Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7180942Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7182202Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7183673Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:17.7240835Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-08-17T13:25:17.7254189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-08-17T13:25:17.7254874Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:17.7343841Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:17.7444135Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7445431Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7447259Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7448657Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:17.7449920Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7451151Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7452425Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7453686Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7454948Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7456203Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7457453Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7458699Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.7464198Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-08-17T13:25:17.7471565Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-08-17T13:25:17.7472242Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:17.7567331Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:17.7683720Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-08-17T13:25:17.7688308Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-08-17T13:25:17.7688975Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 
2022-08-17T13:25:17.7691896Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn)
2022-08-17T13:25:17.7693177Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn)
2022-08-17T13:25:17.7786579Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-08-17T13:25:17.7789136Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn)
2022-08-17T13:25:17.7790412Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn)
2022-08-17T13:25:17.7906690Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0
2022-08-17T13:25:17.7911210Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1
2022-08-17T13:25:17.7911909Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-08-17T13:25:17.7929363Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.8009526Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:17.8024387Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.8137705Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-08-17T13:25:17.8142409Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-08-17T13:25:17.8143964Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:17.8179102Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.8180451Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:17.8181732Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.8183136Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.8197557Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:17.8198293Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:17.8241135Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:17.8271082Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.8272377Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:17.8273649Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.8274926Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:17.8286419Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:17.8287183Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:17.8676434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:17.8677137Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:17.8679545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 
2022-08-17T13:25:17.8680216Z warnings.warn(msg, FutureWarning)
2022-08-17T13:25:17.8780440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0
2022-08-17T13:25:17.8791795Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1
2022-08-17T13:25:17.8793228Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-08-17T13:25:17.8883620Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-08-17T13:25:17.9401604Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1
2022-08-17T13:25:17.9417860Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0
2022-08-17T13:25:17.9419293Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-08-17T13:25:17.9503954Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-08-17T13:25:18.0022932Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0
2022-08-17T13:25:18.0024649Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1
2022-08-17T13:25:18.0026157Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-08-17T13:25:18.0125126Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-08-17T13:25:18.0628599Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-08-17T13:25:18.0633519Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-08-17T13:25:18.0634935Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:18.0662247Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.0663828Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.0665132Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.0731832Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:18.0755071Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:18.0756583Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.0757889Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1240473Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-08-17T13:25:18.1244357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-08-17T13:25:18.1245852Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:18.1343342Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:18.1406948Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1409085Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:18.1410740Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1412019Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1414133Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1416902Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1418177Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1419454Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1420852Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1422121Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1423647Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1425055Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1426313Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:18.1427562Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1428832Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1430080Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1431332Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1432601Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1433836Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1435085Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1436422Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1437693Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1438956Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1440266Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:18.1441515Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1442763Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1444054Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1445298Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1446559Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1447786Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1449033Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1450349Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1451620Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1452867Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1454174Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:18.1455423Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1456668Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1457928Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1459154Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1460401Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1461653Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1462908Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1464359Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1465719Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1466976Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1468236Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:18.1469562Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1470789Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1472036Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1473296Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1474550Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1475805Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1477062Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1478371Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1479677Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1480943Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1482195Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:18.1483433Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1484744Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1486003Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1487262Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1488505Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1489755Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1491006Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1492259Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1493505Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1494855Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1496119Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:18.1497369Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1498684Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1499935Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1501177Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1502430Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1503821Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1505079Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1506320Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1507563Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:18.1508807Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:18.1867068Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-08-17T13:25:18.1871104Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-08-17T13:25:18.1872462Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-08-17T13:25:18.1969873Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-08-17T13:25:18.2447770Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:305: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:25:18.2448903Z subtensor.view(shape) for (subtensor, shape) in 2022-08-17T13:25:18.2454343Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:305: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 
2022-08-17T13:25:18.2455201Z subtensor.view(shape) for (subtensor, shape) in 2022-08-17T13:25:18.2908869Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-08-17T13:25:18.2917513Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-08-17T13:25:18.2918911Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-08-17T13:25:18.3011465Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-08-17T13:25:18.3505906Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-08-17T13:25:18.3510176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-08-17T13:25:18.3511481Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-08-17T13:25:18.3609063Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-08-17T13:25:18.4101001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-08-17T13:25:18.4104908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-08-17T13:25:18.4106258Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-08-17T13:25:18.4203983Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 
2022-08-17T13:25:18.4693459Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-08-17T13:25:18.4696357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-08-17T13:25:18.4697736Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-08-17T13:25:18.4796432Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-08-17T13:25:18.5289744Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-08-17T13:25:18.5295161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-08-17T13:25:18.5296469Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-08-17T13:25:18.5392817Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-08-17T13:25:19.0228541Z ok (4.814s) 2022-08-17T13:25:19.0247580Z test_mixture_of_experts_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59666 2022-08-17T13:25:19.0253531Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59667 2022-08-17T13:25:20.5292849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:20.5293676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:20.5295975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:20.5296465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:20.5319695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:20.5320164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:20.5324205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:20.5324686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:20.7035304Z dist init r=1, world=2 2022-08-17T13:25:20.7039671Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:25:20.7061516Z dist init r=0, world=2 2022-08-17T13:25:20.7066238Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:25:20.7066976Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:20.7143053Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:25:22.0796393Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:25:22.0796912Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:25:22.5231827Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:22.5232676Z warnings.warn( 2022-08-17T13:25:22.5259110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:25:22.5356007Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:22.5356765Z warnings.warn( 2022-08-17T13:25:22.5386007Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:25:22.5386977Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:22.5407334Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.5408654Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5409922Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5464312Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:22.5483099Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5484375Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5485649Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.5586682Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-08-17T13:25:22.5590565Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-08-17T13:25:22.5591251Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:22.5689643Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:22.5738035Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5739325Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5740585Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5741988Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.5743531Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5744812Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5746163Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5747418Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5748649Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5749900Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5751136Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5752385Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.5809714Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-08-17T13:25:22.5813091Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-08-17T13:25:22.5813784Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:22.5912510Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:22.6012326Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6013978Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6015276Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6027278Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6028928Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6030293Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6031924Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.6033207Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6035010Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6036502Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6037777Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6039031Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.6039786Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-08-17T13:25:22.6040293Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-08-17T13:25:22.6040941Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:22.6135873Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:22.6251367Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-08-17T13:25:22.6255271Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-08-17T13:25:22.6255949Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:22.6258460Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6259911Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6354291Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:22.6356426Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6357696Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6473004Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-08-17T13:25:22.6478356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-08-17T13:25:22.6479059Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:22.6494315Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6575830Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:22.6591085Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.6704290Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-08-17T13:25:22.6708399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-08-17T13:25:22.6709103Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:22.6739128Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6740548Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6741842Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6743102Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.6755753Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:22.6756318Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:22.6806280Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:22.6835530Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6836825Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6838085Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.6839347Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.6851298Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:22.6851867Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:22.7239323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:22.7240098Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:22.7242441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:22.7243102Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:22.7338258Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-08-17T13:25:22.7343316Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-08-17T13:25:22.7343994Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:25:22.7441497Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-08-17T13:25:22.7942849Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-08-17T13:25:22.7948263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-08-17T13:25:22.7948944Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:22.8045906Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:22.8548086Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-08-17T13:25:22.8553166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-08-17T13:25:22.8553848Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:22.8651056Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:22.9151806Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-08-17T13:25:22.9157137Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-08-17T13:25:22.9157840Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:22.9181581Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.9182878Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9184322Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9254966Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:22.9277724Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9279013Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9280449Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.9757618Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-08-17T13:25:22.9762651Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-08-17T13:25:22.9763606Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:22.9860901Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:22.9923288Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9925170Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9927067Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9929316Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.9931111Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9932398Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9933646Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9934889Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9936140Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9937515Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9938797Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9940050Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9941365Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9942619Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9944148Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.9945400Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9946649Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9947898Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9949155Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9950412Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9951652Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9953033Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9954296Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9955540Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9956960Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9958412Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.9959642Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9960894Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9962129Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9963380Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9964632Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9965874Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9967184Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9968442Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9969683Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9970982Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9972209Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.9973448Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9974701Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9975934Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9977184Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9978471Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9979715Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9980957Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9982270Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9983660Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9984924Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9986271Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:22.9987517Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9988769Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9990026Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9991271Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9992512Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9993761Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9994987Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9996231Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9997548Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:22.9998805Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0000053Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:23.0001356Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0002598Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0003848Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0005096Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0006322Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0007564Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0008809Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0010056Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0011355Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0012600Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0013842Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:23.0015171Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0016415Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0017653Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0018885Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0020188Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0021433Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0022690Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0024072Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:23.0381736Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-08-17T13:25:23.0387495Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-08-17T13:25:23.0388188Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-08-17T13:25:23.0484810Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-08-17T13:25:23.0985995Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:305: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 
2022-08-17T13:25:23.0986884Z subtensor.view(shape) for (subtensor, shape) in 2022-08-17T13:25:23.1027574Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:305: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:25:23.1028716Z subtensor.view(shape) for (subtensor, shape) in 2022-08-17T13:25:23.1474297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-08-17T13:25:23.1479627Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-08-17T13:25:23.1480322Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-08-17T13:25:23.1577376Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-08-17T13:25:23.2069290Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-08-17T13:25:23.2073224Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-08-17T13:25:23.2074252Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-08-17T13:25:23.2172403Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-08-17T13:25:23.2662171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-08-17T13:25:23.2666197Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-08-17T13:25:23.2667144Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-08-17T13:25:23.2764351Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-08-17T13:25:23.3266334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-08-17T13:25:23.3268253Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-08-17T13:25:23.3268986Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-08-17T13:25:23.3367959Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-08-17T13:25:23.3859404Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-08-17T13:25:23.3863630Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-08-17T13:25:23.3864585Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-08-17T13:25:23.3962632Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-08-17T13:25:23.8369990Z ok (4.814s) 2022-08-17T13:25:23.8388195Z test_mixture_of_experts_with_delay_before_free_offload_false_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59785 2022-08-17T13:25:23.8393655Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59786 2022-08-17T13:25:25.2981949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:25.2982460Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:25.2984987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:25.2985471Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:25.3081910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:25.3082379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:25.3086406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:25.3086890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:25.4726607Z dist init r=1, world=2 2022-08-17T13:25:25.4729934Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:25:25.4799854Z dist init r=0, world=2 2022-08-17T13:25:25.4804478Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:25:25.4805495Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:25.4833366Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:25:26.8567766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:25:26.8568292Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:25:27.2853925Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:27.2854738Z warnings.warn( 2022-08-17T13:25:27.2880999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:25:27.2951717Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:27.2952484Z warnings.warn( 2022-08-17T13:25:27.2981747Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:25:27.2982528Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:27.2983945Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:27.3002013Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.3003348Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.3004629Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.3006006Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.3007270Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.3008532Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:27.3018038Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:27.3018608Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:27.3019282Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:27.3019821Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:27.3523128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:27.3523826Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:27.3531352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:27.3532001Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:27.3621042Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-08-17T13:25:27.3625972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-08-17T13:25:27.3627023Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:27.3724525Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-08-17T13:25:27.4340541Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-08-17T13:25:27.4344339Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-08-17T13:25:27.4345327Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:27.4442007Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:27.5062428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-08-17T13:25:27.5066100Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-08-17T13:25:27.5066985Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:27.5164314Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:27.6350984Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-08-17T13:25:27.6352594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-08-17T13:25:27.6353436Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:27.6451910Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 
2022-08-17T13:25:27.7076612Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-08-17T13:25:27.7079360Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-08-17T13:25:27.7080115Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:27.7101724Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.7103026Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.7104590Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.7177986Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:27.7199495Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:27.7200758Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.7202012Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.7815772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-08-17T13:25:27.7818715Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-08-17T13:25:27.7819488Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:27.7917192Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:27.7959620Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.7961082Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:27.7962335Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.7963613Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.7964885Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.7966147Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.8556070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-08-17T13:25:27.8559107Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-08-17T13:25:27.8559901Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:25:27.8657455Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-08-17T13:25:27.8701195Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.8702492Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.8704145Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.8705553Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.8706922Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.8708394Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.9290392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-08-17T13:25:27.9293601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-08-17T13:25:27.9294371Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:27.9391953Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:27.9436250Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.9437570Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.9438853Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.9440107Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.9441363Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:27.9442611Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0023690Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-08-17T13:25:28.0027195Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-08-17T13:25:28.0027976Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:28.0124773Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:28.0169052Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0170356Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0171779Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0173049Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0174313Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0175575Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:28.0756024Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-08-17T13:25:28.0759648Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-08-17T13:25:28.0760795Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:28.0857541Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:28.0912988Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0914303Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0915580Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0917095Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:28.0918420Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0919695Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0921053Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0922302Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0923568Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0924841Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0926097Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0927365Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0928620Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0929869Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0931182Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:28.0932451Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0933697Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0934990Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0936244Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0937495Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0938757Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0940008Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0941257Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0942515Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0945169Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0946432Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:28.0947812Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0949085Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0950317Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0951650Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0952899Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0954154Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0955411Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0956665Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0957906Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0959161Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0960408Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:28.0961693Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0962953Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0964203Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0965453Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0966762Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0968008Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0969266Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0970505Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0971751Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0972978Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0974225Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:28.0975463Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0976769Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0978063Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0979306Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0980634Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0981889Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0983134Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0984638Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0985886Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0987135Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0988386Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:28.0989635Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0990884Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0992206Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0993476Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0994720Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0996044Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0997264Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0998508Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.0999760Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.1001007Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.1002253Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:28.1003504Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.1004746Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.1006036Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.1007297Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.1008527Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.1009834Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.1011083Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.1012334Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.1529359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-08-17T13:25:28.1535055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-08-17T13:25:28.1536417Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:28.1631407Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:28.2099372Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.2100837Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.2102145Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.2103665Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.2105261Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.2106554Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.2125631Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:28.2127274Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.2128541Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.2129780Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.2131036Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.2132296Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:28.8517896Z ok (5.015s) 2022-08-17T13:25:28.8537403Z test_mixture_of_experts_with_delay_before_free_offload_false_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59988 2022-08-17T13:25:28.8543263Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59989 2022-08-17T13:25:30.3138616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:30.3139126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:30.3141591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:30.3142076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:30.3419326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:30.3419793Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:30.3423776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:30.3424259Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:30.4888132Z dist init r=1, world=2 2022-08-17T13:25:30.4892363Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:25:30.5111258Z dist init r=0, world=2 2022-08-17T13:25:30.5115996Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:25:30.5116889Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:30.5199562Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:25:31.9167752Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:25:31.9168266Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:25:32.3512893Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:32.3514030Z warnings.warn( 2022-08-17T13:25:32.3540944Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:25:32.3542220Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:32.3542976Z warnings.warn( 2022-08-17T13:25:32.3573392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:25:32.3574127Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:32.3595124Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:32.3596414Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:32.3597694Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:32.3614795Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:32.3615341Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:32.3643972Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:32.3662560Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:32.3664260Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:32.3665554Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:32.3681833Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:32.3682497Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:32.8459909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:32.8460606Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:32.8461543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:32.8462187Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:32.8554765Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-08-17T13:25:32.8555303Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-08-17T13:25:32.8556008Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-08-17T13:25:32.8556692Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:33.2219025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-08-17T13:25:33.2219551Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-08-17T13:25:33.2220328Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:33.2221028Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:33.5594905Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-08-17T13:25:33.5595433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-08-17T13:25:33.5596155Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:33.5596857Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:33.8970025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-08-17T13:25:33.8970544Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-08-17T13:25:33.8971255Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:33.8971952Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 
2022-08-17T13:25:34.2354902Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-08-17T13:25:34.2355647Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-08-17T13:25:34.2356441Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:34.2357137Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:34.2376477Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.2377801Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.2379258Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.2380526Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:34.2381800Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.2383063Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.5740693Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-08-17T13:25:34.5741681Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-08-17T13:25:34.5743038Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:34.5843475Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:34.5890262Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.5891569Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:34.5892849Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.5894306Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.5895606Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.5896857Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.9228395Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-08-17T13:25:34.9229088Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-08-17T13:25:34.9229809Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:25:34.9230505Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-08-17T13:25:34.9281763Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.9283110Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.9284381Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.9285651Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.9286921Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:34.9288182Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.2621888Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-08-17T13:25:35.2622985Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-08-17T13:25:35.2624252Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:35.2624957Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:35.2673043Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.2674374Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.2675817Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.2677074Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.2678375Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.2679648Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.6008713Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-08-17T13:25:35.6009811Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-08-17T13:25:35.6010519Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:35.6011188Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:35.6058324Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.6059657Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.6060920Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.6062347Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.6064017Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.6065406Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:35.9392688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-08-17T13:25:35.9393832Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-08-17T13:25:35.9394724Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:35.9395415Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:35.9453075Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9455412Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9457708Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9460422Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:35.9462709Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9464759Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9466047Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9467444Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9468730Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9469981Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9471314Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9472561Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9473807Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9475056Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9476299Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:35.9477526Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9478844Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9480082Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9481322Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9482640Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9483902Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9485149Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9486454Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9487693Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9488934Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9490163Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:35.9491406Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9492653Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9493904Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9495154Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9496449Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9497699Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9498940Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9500243Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9501482Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9502707Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9504147Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:35.9505400Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9506648Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9507891Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9509133Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9510375Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9511701Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9512963Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9514182Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9515503Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9516742Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9517985Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:35.9519236Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9520479Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9521720Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9522970Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9524215Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9525438Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9526761Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9528015Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9529256Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9530571Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9531818Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:35.9533072Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9534290Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9535532Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9536783Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9538036Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9539281Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9540578Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9541838Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9543087Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9544536Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9545766Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:35.9547010Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9548272Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9549520Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9550768Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9552021Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:35.9553266Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:36.2809225Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-08-17T13:25:36.2809769Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-08-17T13:25:36.2810564Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:36.2811466Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:36.3286409Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:36.3287833Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:36.3289392Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:36.3290657Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:36.3291918Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:36.3308776Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:36.3310050Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:36.3311316Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:36.3312572Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:36.3313827Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:37.0726678Z ok (8.221s) 2022-08-17T13:25:37.0745227Z test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60095 2022-08-17T13:25:37.0751403Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60096 2022-08-17T13:25:38.5390098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:38.5390643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:38.5392901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:38.5393406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:38.5861016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:38.5861492Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:38.5865677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:38.5866374Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:38.7084898Z dist init r=0, world=2 2022-08-17T13:25:38.7089555Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:25:38.7592532Z dist init r=1, world=2 2022-08-17T13:25:38.7597556Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:25:38.7599215Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:38.7599926Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:40.1174227Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:25:40.1174744Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:25:40.6062365Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:40.6063195Z warnings.warn( 2022-08-17T13:25:40.6091206Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:25:40.6102056Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:25:40.6102820Z warnings.warn( 2022-08-17T13:25:40.6133692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:25:40.6134394Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:40.6154825Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.6156110Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.6157631Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.6174387Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:40.6174937Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:40.6194273Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:40.6213117Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.6214560Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.6215822Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.6233288Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:40.6233838Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:40.6512843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:40.6513521Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:40.6514427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 
2022-08-17T13:25:40.6515084Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:40.6606445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-08-17T13:25:40.6606960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-08-17T13:25:40.6607632Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:40.6608326Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:40.6997058Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-08-17T13:25:40.6997566Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-08-17T13:25:40.6998256Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:40.6999381Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:40.7386836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-08-17T13:25:40.7387470Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-08-17T13:25:40.7388223Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:40.7389255Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 
2022-08-17T13:25:40.7776264Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-08-17T13:25:40.7776776Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-08-17T13:25:40.7777437Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:40.7778254Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:40.8168426Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-08-17T13:25:40.8169168Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-08-17T13:25:40.8169839Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:40.8170511Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:40.8191452Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.8192751Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.8194021Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.8195294Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.8196557Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.8197814Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.8568678Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-08-17T13:25:40.8569173Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-08-17T13:25:40.8569859Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:40.8570673Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:40.8617694Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.8618974Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.8620373Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.8621619Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.8622874Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.8624312Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:40.8968123Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-08-17T13:25:40.8969832Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-08-17T13:25:40.8970520Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:25:40.9071206Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:25:40.9123245Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9124533Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9125805Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9127173Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:40.9128456Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9129706Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9472931Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-08-17T13:25:40.9473430Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-08-17T13:25:40.9474110Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:40.9474795Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:40.9523026Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9524312Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:40.9525591Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9526834Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9528095Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9529349Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9874351Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-08-17T13:25:40.9874864Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-08-17T13:25:40.9875540Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:40.9876218Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 
2022-08-17T13:25:40.9924567Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9925863Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9927141Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9928478Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9929737Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:40.9930994Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0275196Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-08-17T13:25:41.0275695Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-08-17T13:25:41.0276372Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:41.0277057Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:41.0336596Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0338606Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0340103Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0342124Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0344534Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0346152Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0347431Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0348789Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0350046Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:41.0351290Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0352547Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0353793Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0355047Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0356283Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0357532Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0358878Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0360133Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0361377Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0362698Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0363951Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:41.0365195Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0366449Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0367697Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0368947Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0370178Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0371430Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0372682Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0373990Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0375238Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0376482Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0377786Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:41.0379065Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0380319Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0381559Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0382810Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0384238Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0385496Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0386748Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0387996Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0389326Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0390583Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0391824Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:41.0393122Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0394371Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0395619Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0396868Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0398124Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0399384Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0400628Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0401868Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0403162Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0404419Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0405646Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:41.0406954Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0408202Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0409453Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0410707Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0411952Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0413194Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0414450Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0415699Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0416930Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0418227Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0419481Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:41.0420729Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0422040Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0423407Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0424668Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0425922Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0427170Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0428397Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0429644Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.0697859Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-08-17T13:25:41.0698378Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-08-17T13:25:41.0699057Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:41.0699728Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:41.1188039Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.1189384Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.1190645Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.1192038Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.1193895Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.1195165Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.1196435Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:41.1197689Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.1198944Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.1200187Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:41.5861482Z ok (4.513s) 2022-08-17T13:25:41.5878952Z test_mixture_of_experts_with_delay_before_free_offload_true_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60202 2022-08-17T13:25:41.5884994Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60203 2022-08-17T13:25:43.0139180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:43.0139759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:43.0141955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:43.0142481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:43.0337707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:43.0338180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:43.0342038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:43.0342515Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:43.1816224Z dist init r=0, world=2 2022-08-17T13:25:43.1820377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:25:43.2050203Z dist init r=1, world=2 2022-08-17T13:25:43.2054681Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:25:43.2055874Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:43.2127389Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:25:44.5597777Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:25:44.5598300Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:25:45.0074221Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:45.0075057Z warnings.warn( 2022-08-17T13:25:45.0092620Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:45.0093366Z warnings.warn( 2022-08-17T13:25:45.0101093Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:25:45.0125782Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:25:45.0126820Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:45.0147524Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.0148837Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0150107Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0204546Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:45.0222023Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0223559Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0225002Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.0323923Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-08-17T13:25:45.0324437Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-08-17T13:25:45.0325092Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:45.0325790Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:45.0375528Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0377005Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0378311Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0379590Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.0380845Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0382096Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0383778Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0385080Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0386334Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0387675Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0388933Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0390185Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0447028Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-08-17T13:25:45.0453710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-08-17T13:25:45.0454379Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:45.0550019Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:45.0646586Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0647883Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0649154Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0650416Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0651777Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0653042Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0654289Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.0655621Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0656874Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0658129Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0659378Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0660629Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.0666983Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-08-17T13:25:45.0668248Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-08-17T13:25:45.0669654Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:45.0769655Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:45.0883169Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-08-17T13:25:45.0883666Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-08-17T13:25:45.0884600Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:45.0885297Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:45.0886519Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0887905Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0889209Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.0890465Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.1004769Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-08-17T13:25:45.1011993Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-08-17T13:25:45.1012981Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:45.1028941Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.1106991Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:45.1120588Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.1231152Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-08-17T13:25:45.1231663Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-08-17T13:25:45.1232330Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:45.1233015Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:45.1259191Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.1260446Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.1261718Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.1263090Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.1264564Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.1265815Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.1267169Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.1268418Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.1273739Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:45.1274301Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:45.1275124Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:45.1275674Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:45.1941765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:45.1942462Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:45.1943659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:45.1944348Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:45.2041156Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-08-17T13:25:45.2050008Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-08-17T13:25:45.2050830Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:25:45.2144126Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-08-17T13:25:45.2915706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-08-17T13:25:45.2916208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-08-17T13:25:45.2916940Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:45.2917633Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:45.3687995Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-08-17T13:25:45.3688552Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-08-17T13:25:45.3689330Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:45.3690015Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:45.4728856Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-08-17T13:25:45.4729402Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-08-17T13:25:45.4730496Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:45.4731183Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-08-17T13:25:45.5497428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-08-17T13:25:45.5497971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-08-17T13:25:45.5498721Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:45.5499417Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:45.5531199Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5532584Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5534237Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5535902Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.5537346Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5539186Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5542121Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5545188Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5546797Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5548059Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5549442Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5550699Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5551955Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5553202Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5554444Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.5555698Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5556946Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5558191Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5559511Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5560811Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5562035Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5563350Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5564586Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5565839Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5567087Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5568331Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.5569566Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5570813Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5572059Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5573303Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5574589Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5575846Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5577090Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5578432Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5579681Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5580924Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5582171Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.5583644Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5584911Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5586139Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5587385Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5588711Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5589984Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5591233Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5592554Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5593793Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5595032Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5596274Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.5597502Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5598747Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5600003Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5601250Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5602497Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5603799Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5605054Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5606292Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5607594Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5608838Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5610065Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.5611316Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5612567Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5613818Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5615065Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5616310Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5617553Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5618856Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5620109Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5621331Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5622637Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5624024Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.5625279Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5626527Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5627771Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5629016Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5630263Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5631509Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5632816Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5634069Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5635312Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5636695Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.5637944Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.5639184Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.6298863Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-08-17T13:25:45.6300031Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-08-17T13:25:45.6300778Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-08-17T13:25:45.6301463Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-08-17T13:25:45.6760105Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.6761497Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.6762773Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:45.6764031Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.6765630Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.6766912Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.6771930Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.6773769Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.6775038Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.6776280Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.6777540Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.6778835Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:45.7510780Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-08-17T13:25:45.7511844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-08-17T13:25:45.7512630Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-08-17T13:25:45.7513316Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 
2022-08-17T13:25:45.8286568Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-08-17T13:25:45.8287704Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-08-17T13:25:45.8288515Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-08-17T13:25:45.8289184Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-08-17T13:25:45.9124729Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-08-17T13:25:45.9125607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-08-17T13:25:45.9126392Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-08-17T13:25:45.9127272Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-08-17T13:25:45.9889990Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-08-17T13:25:45.9891033Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-08-17T13:25:45.9891789Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-08-17T13:25:45.9892453Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 
2022-08-17T13:25:46.0662912Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-08-17T13:25:46.0664297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-08-17T13:25:46.0665078Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-08-17T13:25:46.0665775Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-08-17T13:25:46.7010115Z ok (5.115s) 2022-08-17T13:25:46.7028776Z test_mixture_of_experts_with_delay_before_free_offload_true_none (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60417 2022-08-17T13:25:46.7034369Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60418 2022-08-17T13:25:48.1249501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:48.1250017Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:48.1251386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:48.1251870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:48.1613602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:48.1614068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:48.1617704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:48.1618166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:48.2920315Z dist init r=0, world=2 2022-08-17T13:25:48.2923667Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:25:48.3336939Z dist init r=1, world=2 2022-08-17T13:25:48.3341412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:25:48.3342336Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:48.3434114Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:49.6923412Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:25:49.6924114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:25:50.1298513Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:25:50.1299359Z warnings.warn( 2022-08-17T13:25:50.1313876Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:25:50.1314663Z warnings.warn( 2022-08-17T13:25:50.1326706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:25:50.1343938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:25:50.1345258Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:50.1365769Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1367078Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1368339Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1430491Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:25:50.1452302Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:50.1453600Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1454878Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1564207Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-08-17T13:25:50.1602147Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-08-17T13:25:50.1603468Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:50.1667464Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:25:50.1720013Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1721637Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:50.1722917Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1724183Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1725543Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1726798Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1728047Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1729290Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1730536Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1731791Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1733050Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1734308Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:50.1796676Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-08-17T13:25:50.1797776Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-08-17T13:25:50.1798692Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:50.1799410Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:25:50.1904730Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1906094Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1907499Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1908754Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:50.1910007Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1911263Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1912498Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1913750Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1915001Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1916251Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1917577Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1918844Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.1925113Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-08-17T13:25:50.1926221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-08-17T13:25:50.1926941Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:50.1927728Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:25:50.2049038Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-08-17T13:25:50.2050547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-08-17T13:25:50.2051961Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 
2022-08-17T13:25:50.2054540Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2055819Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2152312Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:25:50.2154985Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2156280Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2276330Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-08-17T13:25:50.2277338Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-08-17T13:25:50.2278753Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-08-17T13:25:50.2279461Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:25:50.2294046Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2295443Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2411834Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-08-17T13:25:50.2412883Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-08-17T13:25:50.2414058Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:50.2415076Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:25:50.2444693Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2446110Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2447352Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2448609Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2449864Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2451125Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2452388Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:50.2453633Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:50.2461368Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:50.2462258Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:50.2462946Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:25:50.2463808Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:25:50.7896262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:25:50.7896978Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:50.7897910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 
2022-08-17T13:25:50.7898733Z warnings.warn(msg, FutureWarning) 2022-08-17T13:25:50.7997924Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-08-17T13:25:50.7998939Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-08-17T13:25:50.8000362Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:25:50.8001068Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:25:51.3479224Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-08-17T13:25:51.3480153Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-08-17T13:25:51.3480905Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:51.3481631Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:25:51.8944413Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-08-17T13:25:51.8945526Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-08-17T13:25:51.8946956Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:25:51.9046905Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 
2022-08-17T13:25:52.4521402Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-08-17T13:25:52.4522357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-08-17T13:25:52.4523122Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:52.4523834Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:25:52.9988018Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-08-17T13:25:52.9989129Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-08-17T13:25:52.9990315Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:52.9990998Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:25:53.0025311Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0028346Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0030590Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0033064Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0035773Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0037031Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0038294Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0039559Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:53.0040798Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0042040Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0043299Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0044557Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0045878Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0047138Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0048382Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0049627Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0050935Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0052162Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0053467Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:53.0054694Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0055946Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0057190Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0058436Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0059693Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0061377Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0062644Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0064209Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0065585Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0066817Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0068061Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:53.0069315Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0070567Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0071820Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0073065Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0074308Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0075555Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0076876Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0078117Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0079386Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0080691Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0081938Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:53.0083189Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0084446Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0085692Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0086934Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0088180Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0089426Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0090705Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0091975Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0093219Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0094526Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0095766Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:53.0097010Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0098261Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0099503Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0100744Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0101972Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0103212Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0104599Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0105933Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0107203Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0108444Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0109764Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:53.0111010Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0112245Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0113477Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0114715Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0115965Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0117218Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0118465Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0119711Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0121014Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0122268Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0123506Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:53.0125861Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0127136Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0128391Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0129647Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0130901Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0132157Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0133406Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.0134648Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.5482714Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-08-17T13:25:53.5483713Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-08-17T13:25:53.5484715Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-08-17T13:25:53.5485429Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-08-17T13:25:53.5965669Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.5967052Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.5968597Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.5969864Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.5971134Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.5972401Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.5973654Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:53.5974907Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.5976164Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.5977413Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.5978790Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:25:53.5980062Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:25:54.1378922Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-08-17T13:25:54.1380231Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-08-17T13:25:54.1381413Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-08-17T13:25:54.1482025Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-08-17T13:25:54.6945926Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-08-17T13:25:54.6946936Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-08-17T13:25:54.6947480Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-08-17T13:25:54.6948126Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-08-17T13:25:55.2400018Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-08-17T13:25:55.2400739Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-08-17T13:25:55.2401432Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-08-17T13:25:55.2402130Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 
2022-08-17T13:25:55.7861179Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-08-17T13:25:55.7861698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-08-17T13:25:55.7862454Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-08-17T13:25:55.7863134Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-08-17T13:25:56.3317322Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-08-17T13:25:56.3318080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-08-17T13:25:56.3318870Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-08-17T13:25:56.3319569Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-08-17T13:25:57.3282076Z ok (10.627s) 2022-08-17T13:25:57.3301821Z test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60536 2022-08-17T13:25:57.3307821Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60537 2022-08-17T13:25:58.7520785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:58.7521283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:58.7523869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:58.7524957Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:58.7893309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:25:58.7893781Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:25:58.7897160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:25:58.7897639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:25:58.9192715Z dist init r=1, world=2 2022-08-17T13:25:58.9196516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:25:58.9644419Z dist init r=0, world=2 2022-08-17T13:25:58.9649561Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:25:58.9650422Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:25:58.9706622Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:26:00.3440846Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:26:00.3441359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:26:00.7722366Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:00.7723203Z warnings.warn( 2022-08-17T13:26:00.7749328Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:26:00.7843002Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:00.7843766Z warnings.warn( 2022-08-17T13:26:00.7873926Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:26:00.7874771Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:26:00.7894265Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:00.7895546Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.7896822Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.7954050Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:26:00.7973073Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.7974374Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.7975630Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:00.8075924Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-08-17T13:26:00.8080926Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-08-17T13:26:00.8081786Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:26:00.8178799Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-08-17T13:26:00.8226316Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8228061Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8229321Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8230596Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:00.8231841Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8233101Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8234344Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8235755Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8237045Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8238291Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8239637Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8240891Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8298240Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-08-17T13:26:00.8302281Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-08-17T13:26:00.8303110Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:26:00.8401416Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-08-17T13:26:00.8513205Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8514573Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8515821Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8517064Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8518283Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8520114Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8521383Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:00.8522621Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8524752Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8526021Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8527258Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8528491Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:00.8529225Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-08-17T13:26:00.8529694Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-08-17T13:26:00.8530346Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:26:00.8622604Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-08-17T13:26:00.8738509Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-08-17T13:26:00.8742914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-08-17T13:26:00.8744014Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:26:00.8746610Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8747880Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8841598Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-08-17T13:26:00.8843275Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8844534Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.8959873Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-08-17T13:26:00.8964567Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-08-17T13:26:00.8965264Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:26:00.8980176Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.9063441Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-08-17T13:26:00.9077738Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:00.9190379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-08-17T13:26:00.9194439Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-08-17T13:26:00.9195422Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:26:00.9225483Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.9226803Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.9228056Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.9229330Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:00.9242454Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:00.9243057Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:00.9293534Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-08-17T13:26:00.9322093Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.9323640Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.9325320Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:00.9326854Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:00.9337753Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:00.9338333Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:00.9772893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:00.9773631Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:00.9848351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:00.9849325Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:00.9951619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-08-17T13:26:00.9952462Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-08-17T13:26:00.9953969Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-08-17T13:26:01.0053931Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-08-17T13:26:01.0565427Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-08-17T13:26:01.0571371Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-08-17T13:26:01.0572074Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:26:01.0667096Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-08-17T13:26:01.1172048Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-08-17T13:26:01.1177936Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-08-17T13:26:01.1178664Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:26:01.1274468Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-08-17T13:26:01.1777442Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-08-17T13:26:01.1781908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-08-17T13:26:01.1783142Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:26:01.1805985Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:01.1807369Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.1808647Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.1880093Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-08-17T13:26:01.1900296Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.1901589Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.1902858Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:01.2387365Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-08-17T13:26:01.2392736Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-08-17T13:26:01.2393628Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:26:01.2443232Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2445273Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2447253Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2449294Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2450705Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2451971Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2453205Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2454470Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2455716Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2456972Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:01.2458228Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2459477Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2460728Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2462069Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2463616Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2464897Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2466256Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2467511Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2468763Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2470024Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2471280Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:01.2472530Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2473780Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2475031Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2476412Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2477669Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2478942Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2480199Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2481518Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2482777Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2484035Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2485282Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:01.2486528Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2487782Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2489034Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2490846Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2492187Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2493455Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2494711Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2495689Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-08-17T13:26:01.2537011Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2538315Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2540229Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2541840Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2543765Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2545059Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2546324Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2547573Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2548956Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:01.2550239Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2551482Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2552816Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2554056Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2555292Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2556526Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2557794Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2559052Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2560298Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2561550Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2562793Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:01.2564099Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2565356Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2566600Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2567927Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2569150Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2570386Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2571632Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2572880Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2574134Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2575386Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2576635Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:01.2577882Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2579235Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2580484Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2581726Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2583043Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2584518Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2585774Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.2587029Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.3008101Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-08-17T13:26:01.3012959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-08-17T13:26:01.3013687Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-08-17T13:26:01.3110652Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-08-17T13:26:01.3558897Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.3560220Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.3561484Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.3563009Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.3596335Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.3597610Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.3599007Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:01.3600267Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:01.4057256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-08-17T13:26:01.4061672Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-08-17T13:26:01.4062435Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-08-17T13:26:01.4159992Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-08-17T13:26:01.4655959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-08-17T13:26:01.4661451Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-08-17T13:26:01.4662167Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-08-17T13:26:01.4758709Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-08-17T13:26:01.5252663Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-08-17T13:26:01.5257370Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-08-17T13:26:01.5258072Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 
2022-08-17T13:26:01.5355423Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-08-17T13:26:01.5846360Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-08-17T13:26:01.5851089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-08-17T13:26:01.5851961Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-08-17T13:26:01.5949224Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-08-17T13:26:01.6442549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-08-17T13:26:01.6448373Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-08-17T13:26:01.6449120Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-08-17T13:26:01.6545474Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-08-17T13:26:02.1423852Z ok (4.814s) 2022-08-17T13:26:02.1442734Z test_nested_always_wrap_model_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60655 2022-08-17T13:26:02.1449046Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60656 2022-08-17T13:26:03.5977190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:03.5977712Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:03.5980251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:03.5980741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:03.6727493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:03.6727977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:03.6731826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:03.6732314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:03.7657096Z dist init r=0, world=2 2022-08-17T13:26:03.7661442Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:26:03.8463708Z dist init r=1, world=2 2022-08-17T13:26:03.8468652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:26:03.8469582Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:03.8475965Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:26:05.2087815Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:26:05.2088340Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:26:05.2347873Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:05.2348463Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:05.2349181Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:05.2349730Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:05.6578924Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:05.6588066Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:05.6617721Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:05.6618520Z warnings.warn( 2022-08-17T13:26:05.6627250Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:05.6628045Z warnings.warn( 2022-08-17T13:26:05.7048935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:05.7049765Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:05.7062565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:05.7063226Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:05.7113781Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:05.7114944Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:05.7611987Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:05.7613280Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:05.8105780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:05.8108071Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:05.8596661Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:05.8597929Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:05.9096524Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:05.9098525Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:05.9593153Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:05.9594578Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:06.0087686Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:06.0088696Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:06.0586833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:06.0588983Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:06.0644752Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0646074Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0648446Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:06.0649923Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0651850Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0653551Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0654847Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0656100Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0657365Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0658609Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0659856Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0661093Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0662341Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0663898Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:06.0665284Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0666562Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0667807Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0669145Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0670391Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0671638Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0672869Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0674117Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0675352Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0676612Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0677859Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:06.0679131Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0680450Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.0681711Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:06.1101702Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:06.1102970Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:06.2041313Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:06.2041852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:06.2540305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:06.2542411Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:06.7560428Z ok (4.614s) 2022-08-17T13:26:06.7578373Z test_nested_always_wrap_model_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60738 2022-08-17T13:26:06.7584418Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60739 2022-08-17T13:26:08.2341874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:08.2342381Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:08.2344773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:08.2345263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:08.2590142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:08.2590601Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:08.2594371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:08.2594847Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:08.4077976Z dist init r=1, world=2 2022-08-17T13:26:08.4081885Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:26:08.4254296Z dist init r=0, world=2 2022-08-17T13:26:08.4258914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:26:08.4259685Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:08.4286907Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:26:09.8138137Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:26:09.8139121Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:26:09.8391972Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:09.8392953Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:09.8394600Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:09.8395582Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:10.2505217Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.2513839Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.2544410Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:10.2546375Z warnings.warn( 2022-08-17T13:26:10.2554362Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:10.2555915Z warnings.warn( 2022-08-17T13:26:10.3155583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:10.3157116Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:10.3158940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:10.3160232Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:10.3210757Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.3211740Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.3872376Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.3873335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.4529746Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.4531378Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.5176942Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.5177980Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:10.5841029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.5843541Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.6500101Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.6500695Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.7243978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.7244524Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.7908106Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.7910138Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.7974319Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7976125Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7978278Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:10.7980302Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7981761Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7983041Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7984574Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7985858Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7987115Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7988370Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7989621Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7990977Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7992243Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7993492Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:10.7994802Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7996047Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7997305Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7998571Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.7999822Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.8001074Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.8002337Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.8003589Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.8004842Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.8006194Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.8007454Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:10.8008695Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.8010004Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.8011259Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.8012509Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.8013766Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:10.8595409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:10.8597486Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.9700828Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:10.9701381Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:11.0366537Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:11.0368612Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:11.5702320Z ok (4.814s) 2022-08-17T13:26:11.5721291Z test_nested_always_wrap_model_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60821 2022-08-17T13:26:11.5727090Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60822 2022-08-17T13:26:13.0296786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:13.0297284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:13.0299564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:13.0300074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:13.0421401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:13.0422073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:13.0425726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:13.0426219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:13.1980716Z dist init r=1, world=2 
2022-08-17T13:26:13.1985014Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:26:13.2156401Z dist init r=0, world=2 2022-08-17T13:26:13.2160993Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:26:13.2161719Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:13.2189513Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:14.5887326Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:26:14.5887852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:26:14.6145810Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:14.6146367Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:14.6147063Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:14.6147607Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:15.0488640Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.0489171Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.0526647Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:15.0527458Z warnings.warn( 2022-08-17T13:26:15.0528583Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:15.0529341Z warnings.warn( 2022-08-17T13:26:15.1099763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:15.1100499Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:15.1102343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:15.1103002Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:15.1155702Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.1156204Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.1797139Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:15.1797687Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.2432841Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.2433374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.3064352Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.3064877Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.3704460Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.3705168Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.4340859Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.4341831Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.4974846Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.4975813Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.5617721Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.5618685Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.5683328Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5685940Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5688554Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5691194Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5693848Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5696446Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5699184Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:15.5701649Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5704487Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5707144Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5709882Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5712414Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5715018Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5717739Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5720386Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5722964Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5725602Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5728263Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:15.5731082Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5733712Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5736228Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5739033Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5741677Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5744548Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5747081Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5749597Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5752139Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5754666Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.5757113Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:15.5759499Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:15.6298509Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.6299480Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.7394446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.7395462Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.8041223Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:15.8042146Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:16.2841088Z ok (4.714s) 2022-08-17T13:26:16.2859684Z test_nested_always_wrap_model_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60904 2022-08-17T13:26:16.2866082Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60905 2022-08-17T13:26:17.7405644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:17.7406154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:17.7408428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:17.7408915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:17.7476380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:17.7480732Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:17.7481321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:17.7481802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:17.9104815Z dist init r=1, world=2 2022-08-17T13:26:17.9109368Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:26:17.9198713Z dist init r=0, world=2 2022-08-17T13:26:17.9203477Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:26:17.9204287Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:17.9212364Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:26:19.2881851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:26:19.2882377Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:26:19.3112592Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:19.3113171Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:19.3113869Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:19.3114409Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:19.7243909Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.7244451Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.7281702Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:19.7282586Z warnings.warn( 2022-08-17T13:26:19.7283766Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:19.7284572Z warnings.warn( 2022-08-17T13:26:19.7392050Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.7392683Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.7524645Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.7525143Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.7657185Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.7657664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.7789647Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.7790132Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.7922148Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.7922639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.8054677Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.8055161Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.8662238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 
2022-08-17T13:26:19.8662919Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:19.8672302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:19.8672959Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:19.8723725Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.8724221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.9373948Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.9374427Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:19.9825469Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:26:19.9826264Z return iter(self.unbind(0)) 2022-08-17T13:26:19.9827529Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 
2022-08-17T13:26:19.9828306Z return iter(self.unbind(0)) 2022-08-17T13:26:20.0044241Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.0044723Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.0556616Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.0558240Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.0559504Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.0560989Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.0562249Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.0563503Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.1143135Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.1143636Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.1794203Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.1794709Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.2443223Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.2443701Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.3077441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.3077945Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.3722614Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.3723118Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.4368762Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.4369296Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.5002988Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:20.5003747Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.5649001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.5649503Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:20.5712245Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.5713557Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.5715009Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.5716280Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.5717531Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.5718789Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.5720056Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.5721319Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.5722574Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:20.5723821Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:21.0982891Z ok (4.814s) 2022-08-17T13:26:21.1003351Z test_nested_always_wrap_model_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60987 2022-08-17T13:26:21.1009276Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60988 2022-08-17T13:26:22.5393264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:22.5393836Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:22.5395275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:22.5395754Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:22.6205677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:22.6206409Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:22.6209130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:22.6209621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:22.7083603Z dist init r=0, world=2 2022-08-17T13:26:22.7087626Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:26:22.7933635Z dist init r=1, world=2 2022-08-17T13:26:22.7938320Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:26:22.7939216Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:26:22.8004242Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:24.1614402Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:26:24.1614943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:26:24.1831198Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:24.1831760Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:24.1832470Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:24.1833021Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:24.5982580Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.5991396Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.6020932Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:24.6021725Z warnings.warn( 2022-08-17T13:26:24.6030215Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:24.6030975Z warnings.warn( 2022-08-17T13:26:24.6147829Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.6148606Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.6285453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.6285948Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.6423530Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.6424343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.6562386Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.6562887Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.6701470Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.6702269Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.6842359Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.6842867Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.7776892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 
2022-08-17T13:26:24.7777608Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:24.7778534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:24.7779194Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:24.7831894Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.7832574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.8636450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.8636921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.9432031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.9432528Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:24.9506663Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9508034Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9509410Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9510835Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9512555Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9513851Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9515119Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9516467Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:24.9517734Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9518976Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9520229Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9521470Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9522724Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9523972Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9525201Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9526506Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9527764Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9529016Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:24.9530326Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:24.9531581Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:25.0241187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.0241695Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.1485293Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.1485805Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.2280762Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.2281254Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.3057571Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.3058052Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.3849798Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.3850278Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.4633550Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.4634055Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.5411255Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:25.5411854Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.6203381Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:25.6203879Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:26.2131911Z ok (5.115s) 2022-08-17T13:26:26.2151257Z test_nested_always_wrap_model_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61070 2022-08-17T13:26:26.2157111Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61071 2022-08-17T13:26:27.7048135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:27.7048653Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:27.7050979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:27.7051487Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:27.7252907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:27.7253371Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:27.7257672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:27.7258152Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:27.8717982Z dist init r=0, world=2 2022-08-17T13:26:27.8722040Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:26:27.8989583Z dist init r=1, world=2 2022-08-17T13:26:27.8994441Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:26:27.8995243Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:27.9029134Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:29.2655363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:26:29.2655899Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:26:29.2905588Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:29.2906141Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:29.2906864Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:29.2907417Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:29.7003120Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7011618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7039724Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:26:29.7040523Z warnings.warn( 2022-08-17T13:26:29.7050543Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:29.7051299Z warnings.warn( 2022-08-17T13:26:29.7163425Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7165852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7299366Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7301640Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7435892Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7438473Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7572423Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7574372Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7708799Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7710834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7845205Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.7847268Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:29.8574635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:29.8575484Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:29.8578076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:29.8578746Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:29.8628477Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.8630395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.9395674Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:29.9398068Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.0161177Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.0163101Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.0230383Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0231649Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0232931Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0234207Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0235469Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0236862Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0238153Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:30.0239412Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0240758Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0242014Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0243280Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0244523Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0245775Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0247023Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0248289Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0249543Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0250798Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0252101Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:30.0253355Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0254601Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:30.0945906Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.0946444Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.2166688Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.2167197Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.2928435Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.2929305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.3673767Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.3674993Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.4431549Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.4433405Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:30.5182912Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.5184942Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.5930098Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.5931891Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.6688897Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:30.6690622Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:31.2291278Z ok (5.016s) 2022-08-17T13:26:31.2308816Z test_nested_wrapped_model_offload_false_no_shard (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61153 2022-08-17T13:26:31.2314598Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61154 2022-08-17T13:26:32.6911260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:32.6911771Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:32.6913649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:32.6914138Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:32.7332271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:32.7332787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:32.7336535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:32.7337039Z warnings.warn(f"loaded {len(disabled_tests_dict)} 
disabled tests") 2022-08-17T13:26:32.8580177Z dist init r=1, world=2 2022-08-17T13:26:32.8584490Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:26:32.9048724Z dist init r=0, world=2 2022-08-17T13:26:32.9053230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:26:32.9054226Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:32.9094056Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:34.2673197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:26:34.2673729Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:26:34.2946410Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:34.2946984Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:34.2947688Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:34.2948210Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:34.7204004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.7204559Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:34.7235087Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:34.7235856Z warnings.warn( 2022-08-17T13:26:34.7236960Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:34.7237697Z warnings.warn( 2022-08-17T13:26:34.7543904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:34.7544793Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:34.7553561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:34.7554210Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:34.7605525Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:34.7606032Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.7979863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.7980602Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.8351363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.8351852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.8723338Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.8723816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.9098399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.9098890Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.9471013Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.9471478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.9845026Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:34.9845513Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:35.0221361Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:35.0221844Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:35.0595464Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:35.0595948Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:35.0970594Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:35.0971135Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:35.1137828Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1140167Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1142161Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1144173Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1146505Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:35.1148416Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1150921Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1152227Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1154041Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1155431Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1156695Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1157988Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1159240Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1160489Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1161739Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1162996Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:35.1164237Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1165547Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1166803Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1168053Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1169345Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1170594Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1171831Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1173086Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1174334Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1175571Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1176823Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:35.1178071Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1179318Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1180649Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1181891Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1183130Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1184986Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1186250Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1187499Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1188754Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1190000Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1191240Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:35.1192495Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1193722Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1194965Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1196297Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1197563Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1198814Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1200125Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1201364Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1202618Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1203863Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1205090Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:35.1206332Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1207579Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1208824Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1210122Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1211376Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1212621Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1213919Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1215163Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1216408Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1217637Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1218875Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:35.1220118Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1221356Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1222600Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1224023Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1225360Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1226617Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1227869Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1229165Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1230411Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1231649Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1232895Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:35.1234146Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1235387Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1236642Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1237882Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1239187Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1372769Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:35.1373272Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:35.1955519Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1956870Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1958607Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1959975Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1961895Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1963201Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1964472Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.1965730Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:35.6424025Z ok (4.413s) 2022-08-17T13:26:35.6442067Z test_nested_wrapped_model_offload_false_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61236 2022-08-17T13:26:35.6447668Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61237 2022-08-17T13:26:37.1018926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:37.1019447Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:37.1021539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:37.1022046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:37.1026285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:37.1026975Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:37.1030665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:37.1031150Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:37.2773882Z dist init r=1, world=2 2022-08-17T13:26:37.2777978Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:26:37.2812093Z dist init r=0, world=2 2022-08-17T13:26:37.2816697Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:26:37.2817684Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:37.2881672Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:26:38.6361538Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:26:38.6362404Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:26:38.6628518Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:38.6629729Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:38.6631153Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:38.6632240Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:39.0791617Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.0799815Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.0824533Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:39.0825441Z warnings.warn( 2022-08-17T13:26:39.0832248Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:39.0833010Z warnings.warn( 2022-08-17T13:26:39.1265769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:39.1266453Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:39.1267359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:39.1268020Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:39.1318552Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.1319056Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.1807459Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.1808000Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.2291121Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.2291610Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.2773573Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.2774061Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:39.3265431Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.3266060Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.3750364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.3750851Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.4235647Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.4236163Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.4726311Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.4726807Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.5213646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.5214133Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.5700579Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.5701070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.5913052Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5914366Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5915634Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5916898Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5918168Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5919595Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5920877Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:39.5922133Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5923467Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5924703Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5925955Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5927245Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5928507Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5929758Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5931007Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5932262Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5933510Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5934811Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:39.5936064Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5937310Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5938591Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5939838Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5941076Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5942333Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5943965Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5945246Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5946503Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5947749Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5949080Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:39.5950326Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5951572Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5952819Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5954143Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5955400Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5956651Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5957899Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5959139Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5960393Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5961634Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5962860Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:39.5964162Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5965423Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5966673Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5968031Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5969273Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5970517Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5971767Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5973017Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5974247Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5975498Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5976742Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:39.5977991Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5979297Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5980584Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5981835Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5983151Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5984585Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5985810Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5987066Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5988309Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5989562Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5990821Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:39.5992069Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5993400Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5994667Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5995922Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5997234Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5998474Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.5999716Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.6000971Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.6002218Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.6003465Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.6004718Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:39.6005967Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.6217678Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.6218166Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:39.6842918Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.6844284Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.6845542Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.6846902Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:39.6848158Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.6849404Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.6850651Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:39.6851889Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:40.1560113Z ok (4.514s) 2022-08-17T13:26:40.1577975Z test_nested_wrapped_model_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61319 2022-08-17T13:26:40.1584200Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61320 2022-08-17T13:26:41.5861156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:41.5861687Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:41.5863659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:41.5864143Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:41.6180413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:41.6180888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:41.6184712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:41.6185191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:41.7520232Z dist init r=1, world=2 2022-08-17T13:26:41.7524081Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:26:41.7909309Z dist init r=0, world=2 2022-08-17T13:26:41.7914235Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:26:41.7914991Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:41.7931984Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:26:43.1707831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:26:43.1708384Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:26:43.1952893Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:43.1953782Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:43.1954506Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:43.1955051Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:43.5983942Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.5984731Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.6014543Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:43.6015365Z warnings.warn( 2022-08-17T13:26:43.6016489Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:43.6017240Z warnings.warn( 2022-08-17T13:26:43.6428926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:43.6429639Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:43.6430564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:43.6431225Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:43.6481329Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.6481849Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.7077199Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.7078171Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.7546701Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.7547650Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.8012088Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.8013073Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:43.8480642Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.8481568Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.8944754Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.8945702Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.9411986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.9413135Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.9913984Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:43.9914958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:44.0384350Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:44.0385305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:44.0857920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:44.0858886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:44.1070647Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1072819Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1074370Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1076895Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1078879Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1081225Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1082486Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:44.1083953Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1085236Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1086484Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1087823Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1089066Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1090318Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1091569Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1092812Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1094063Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1095342Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1096586Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:44.1097884Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1099139Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1100379Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1101669Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1102904Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1105050Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1106330Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1107580Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1108825Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1110070Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1111309Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:44.1112546Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1113871Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1115142Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1116383Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1117706Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1118953Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1120195Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1121445Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1122682Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1123914Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1125146Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:44.1126383Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1127612Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1128922Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1130176Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1131417Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1132716Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1133955Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1135196Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1136423Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1137666Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1138909Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:44.1140156Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1141399Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1142717Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1144741Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1146004Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1147350Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1148589Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1149815Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1151062Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1152312Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1153560Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:44.1154807Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1156045Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1157285Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1158596Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1159851Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1161079Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1162380Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1163622Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1164866Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1166116Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1167354Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:44.1168583Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1367495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:44.1368437Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:44.1963100Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1964435Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1965982Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1967260Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:44.1974827Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1976239Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1977491Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.1978737Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:44.6693380Z ok (4.513s) 2022-08-17T13:26:44.6711025Z test_nested_wrapped_model_offload_true_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61402
2022-08-17T13:26:44.6716820Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61403
2022-08-17T13:26:46.1157326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests
2022-08-17T13:26:46.1157836Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-08-17T13:26:46.1159203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests
2022-08-17T13:26:46.1159948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-08-17T13:26:46.1287226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests
2022-08-17T13:26:46.1288003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-08-17T13:26:46.1291780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests
2022-08-17T13:26:46.1292539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-08-17T13:26:46.2847659Z dist init r=0, world=2
2022-08-17T13:26:46.2851383Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-08-17T13:26:46.3056658Z dist init r=1, world=2
2022-08-17T13:26:46.3061382Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-08-17T13:26:46.3062888Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-08-17T13:26:46.3158564Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-08-17T13:26:47.6546444Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-08-17T13:26:47.6547347Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-08-17T13:26:47.6792242Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead
2022-08-17T13:26:47.6793275Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead")
2022-08-17T13:26:47.6794566Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead
2022-08-17T13:26:47.6795535Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead")
2022-08-17T13:26:48.0845352Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-08-17T13:26:48.0846757Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-08-17T13:26:48.0877003Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-08-17T13:26:48.0878557Z warnings.warn(
2022-08-17T13:26:48.0880832Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU.
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:48.0882332Z warnings.warn( 2022-08-17T13:26:48.0968815Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.0969779Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.0971763Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.0974209Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.0975890Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.0977144Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:48.0978428Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.0979875Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.0981197Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.0982457Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1078504Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.1079461Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.1122950Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:26:48.1125707Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1128362Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1130964Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1133515Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1136112Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1138654Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1141199Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1196137Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.1197301Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.1302238Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.1303195Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.1334831Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1337497Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1340300Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1342855Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1345803Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1348351Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1350787Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1353260Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1355953Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1358593Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1421880Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.1423037Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.1528025Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.1528982Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.1561815Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1564484Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1567191Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1569748Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1572283Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1574854Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1577375Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1580013Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1582587Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.1585398Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:48.2003562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:48.2005094Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:48.2006881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:48.2008180Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:48.2055506Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.2056458Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.2549536Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.2550712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.3039288Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:48.3040257Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-08-17T13:26:48.3529266Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-08-17T13:26:48.3530271Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-08-17T13:26:48.4104450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-08-17T13:26:48.4105439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-08-17T13:26:48.4594639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-08-17T13:26:48.4595621Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-08-17T13:26:48.4747587Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.)
2022-08-17T13:26:48.4748402Z return iter(self.unbind(0))
2022-08-17T13:26:48.4749534Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.)
2022-08-17T13:26:48.4761840Z return iter(self.unbind(0))
2022-08-17T13:26:48.5090145Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-08-17T13:26:48.5091115Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.6010589Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.6011447Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.6490652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.6491613Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.6972647Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.6973588Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.7455436Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:48.7456415Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:49.2825940Z ok (4.613s) 2022-08-17T13:26:49.2844223Z test_nested_wrapped_model_offload_true_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61485 2022-08-17T13:26:49.2850097Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61486 2022-08-17T13:26:50.7866774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:50.7867261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:50.7870034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:50.7870525Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:50.8020487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:50.8020948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:50.8025105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:50.8025597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:50.9531465Z dist init r=0, world=2 2022-08-17T13:26:50.9535337Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:26:50.9758375Z dist init r=1, world=2 2022-08-17T13:26:50.9763005Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:26:50.9763985Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:50.9842203Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:26:52.3370647Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:26:52.3371183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:26:52.3626482Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:52.3627070Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:52.3627777Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:52.3628322Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:52.7829190Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.7829739Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.7859623Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:52.7860509Z warnings.warn( 2022-08-17T13:26:52.7861623Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:52.7862372Z warnings.warn( 2022-08-17T13:26:52.7955028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.7955553Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.8063449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.8063951Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.8092234Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8093710Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8094988Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8096264Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8097549Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8098818Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8100078Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8101337Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8174772Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.8175255Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.8284555Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:52.8285490Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.8306555Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8308654Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8310098Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8311474Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8312948Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8314341Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8315712Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8317089Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8318462Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8319809Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8395685Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.8396705Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.8451660Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8453756Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:52.8507408Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.8508099Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.9063663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:52.9064787Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:52.9065757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:52.9066594Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:52.9115368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.9116076Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:52.9703159Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:52.9703803Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.0285714Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.0286592Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.0866883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.0867389Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.1451973Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.1452806Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.2033390Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.2034137Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.2211032Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:26:53.2211845Z return iter(self.unbind(0)) 2022-08-17T13:26:53.2212965Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:26:53.2213739Z return iter(self.unbind(0)) 2022-08-17T13:26:53.2621421Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.2622361Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.3647699Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.3648200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.4222138Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.4223605Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.4794328Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.4795134Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.5371176Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:53.5371943Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:54.0964139Z ok (4.814s) 2022-08-17T13:26:54.0982265Z test_nested_wrapped_model_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61568 2022-08-17T13:26:54.0989131Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61569 2022-08-17T13:26:55.5171692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:55.5172205Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:55.5174436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:55.5174908Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:55.5444283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:26:55.5444745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:26:55.5448671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:26:55.5449149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:26:55.6840850Z dist init r=0, world=2 2022-08-17T13:26:55.6844560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:26:55.7161350Z dist init r=1, world=2 2022-08-17T13:26:55.7165917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:26:55.7166986Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:26:55.7252433Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:26:57.0773760Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:26:57.0774300Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:26:57.0990538Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:57.0991137Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:57.0991835Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:26:57.0992389Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:26:57.5121668Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.5130140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.5154726Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:57.5155532Z warnings.warn( 2022-08-17T13:26:57.5163747Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:26:57.5164528Z warnings.warn( 2022-08-17T13:26:57.5262052Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.5263239Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.5377690Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.5379961Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.5409082Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5410386Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5411673Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5412937Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5414218Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5415462Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5416725Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5417967Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5496665Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.5498066Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.5612943Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:57.5614362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.5637122Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5638440Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5639915Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5641187Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5642454Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5643736Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5644991Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5646256Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5647507Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5648766Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5731251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.5732587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.5789307Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5793414Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:26:57.5849014Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.5850747Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.6425212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:57.6425920Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:57.6428986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:26:57.6429647Z warnings.warn(msg, FutureWarning) 2022-08-17T13:26:57.6480106Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.6481180Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.7091530Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:26:57.7092970Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.7696031Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.7698163Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.8301442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.8301978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.8907668Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.8908855Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.9515493Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.9516575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:57.9698018Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:26:57.9698832Z return iter(self.unbind(0)) 2022-08-17T13:26:57.9699972Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:26:57.9700738Z return iter(self.unbind(0)) 2022-08-17T13:26:58.0128567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:58.0129432Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:58.1259301Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:58.1259834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:58.1855456Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:58.1857416Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:58.2449710Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:58.2451157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:58.3051799Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:58.3053821Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:26:58.8100191Z ok (4.713s) 2022-08-17T13:26:58.8120006Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61651 2022-08-17T13:26:58.8126200Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61652 2022-08-17T13:27:00.3235648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:00.3236158Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:00.3238427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:00.3238918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:00.3247129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:00.3247591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:00.3251361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:00.3251819Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:00.4979984Z dist init r=1, world=2 2022-08-17T13:27:00.4983942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:27:00.5024442Z dist init r=0, world=2 2022-08-17T13:27:00.5029456Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:27:00.5030392Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:00.5087547Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:27:01.8925481Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:27:01.8926064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:27:01.9160174Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:01.9160749Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:01.9161436Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:01.9162004Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:02.3237157Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:02.3238129Z warnings.warn( 2022-08-17T13:27:02.3522781Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:27:02.3523811Z warnings.warn( 2022-08-17T13:27:02.3750327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:02.3751477Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:02.3752787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:02.3753436Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:02.6837812Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6839977Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6841579Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:02.6843092Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6844380Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6845628Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6847092Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6848376Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6849635Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6850867Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6852200Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6853465Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6854727Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6855974Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:02.6857224Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6858481Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6859733Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6860979Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6862270Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6863940Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6865207Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6866571Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6867827Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6869084Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6870338Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:02.6871585Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6872836Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6874093Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6875338Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6876568Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6877888Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6879144Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6880394Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6881736Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6882991Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6884237Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:02.6885488Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6886732Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6887964Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6889213Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6890456Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6891757Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6893021Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6894268Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6895573Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6896816Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6898058Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:02.6899298Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6900544Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6901791Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6903051Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.6904456Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:02.7565112Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:775: UserWarning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:02.7566279Z return super(Tensor, self).split_with_sizes(split_size, dim) 2022-08-17T13:27:02.7585732Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:775: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:02.7586562Z return super(Tensor, self).split_with_sizes(split_size, dim) 2022-08-17T13:27:03.2232697Z ok (4.413s) 2022-08-17T13:27:03.2251453Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61734 2022-08-17T13:27:03.2257587Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61735 2022-08-17T13:27:04.6779986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:04.6780489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:04.6782122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:04.6782900Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:04.6970347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:04.6971067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:04.6974482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:04.6975250Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:04.8459607Z dist init r=1, world=2 2022-08-17T13:27:04.8463359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:27:04.8703157Z dist init r=0, world=2 2022-08-17T13:27:04.8708342Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:27:04.8709862Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:04.8769639Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:27:06.2217461Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:27:06.2218019Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:27:06.2479226Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:06.2479821Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:06.2480521Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:06.2481061Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:06.6734686Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:06.6735537Z warnings.warn( 2022-08-17T13:27:06.6736946Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:27:06.6737711Z warnings.warn( 2022-08-17T13:27:06.7034480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:06.7035161Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:06.7036257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:06.7036909Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:07.0753049Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0754406Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0755690Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:07.0756929Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0758205Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0759470Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0760722Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0761977Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0763489Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0764913Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0766275Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0767744Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0769108Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0770466Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:07.0771816Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0773181Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0774540Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0775903Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0777262Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0778619Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0780044Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0781450Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0782816Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0784635Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0785980Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:07.0787346Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0788712Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0790073Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0791436Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0792799Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0794160Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0795613Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0796989Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0798337Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0799798Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0801166Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:07.0802528Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0803893Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0805249Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0806600Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0807970Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0809331Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0810671Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0812094Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0813467Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0814821Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0816242Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:07.0817604Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0818963Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0820321Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0821685Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0823027Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0824539Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.0825899Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:07.1562103Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3243: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:07.1563218Z p.data = p.data[:p._unsharded_size.numel()].view(p._unsharded_size) # type: ignore[attr-defined] 2022-08-17T13:27:07.1579190Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3243: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 
2022-08-17T13:27:07.1580346Z p.data = p.data[:p._unsharded_size.numel()].view(p._unsharded_size) # type: ignore[attr-defined] 2022-08-17T13:27:07.6367501Z ok (4.413s) 2022-08-17T13:27:07.6388220Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61817 2022-08-17T13:27:07.6393802Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61818 2022-08-17T13:27:09.0646168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:09.0646676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:09.0648825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:09.0649310Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:09.0946934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:09.0947398Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:09.0951411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:09.0951890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:09.2314309Z dist init r=1, world=2 2022-08-17T13:27:09.2318434Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:27:09.2673638Z dist init r=0, world=2 2022-08-17T13:27:09.2678128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:27:09.2679446Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:27:09.2727007Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:10.6335469Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:27:10.6336001Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:27:10.6558983Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:10.6559549Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:10.6560247Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:10.6560786Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:11.0786972Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:11.0787831Z warnings.warn( 2022-08-17T13:27:11.0865705Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:27:11.0866462Z warnings.warn( 2022-08-17T13:27:11.1166364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:11.1167187Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:11.1169877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:11.1170553Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:11.4962974Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4964352Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4965645Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:11.4966917Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4968189Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4969449Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4970698Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4972211Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4973491Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4974765Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4976107Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4977343Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4978610Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4979876Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:11.4981122Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4982418Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4983934Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4985200Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4986531Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4987806Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4989055Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4990282Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4991605Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4992853Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4994114Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:11.4995371Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4996616Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4997875Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.4999123Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5000376Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5001666Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5002929Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5004175Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5005482Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5006734Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5007975Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:11.5009229Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5010472Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5011719Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5012954Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5014201Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5015444Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5016751Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5018014Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5019262Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5020569Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5021814Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:11.5023057Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5024492Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5025729Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5026976Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5028239Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5029490Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5030815Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:11.5800080Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3243: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:11.5801161Z p.data = p.data[:p._unsharded_size.numel()].view(p._unsharded_size) # type: ignore[attr-defined] 2022-08-17T13:27:11.5807834Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3243: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:11.5809228Z p.data = p.data[:p._unsharded_size.numel()].view(p._unsharded_size) # type: ignore[attr-defined] 2022-08-17T13:27:12.0503685Z ok (4.414s) 2022-08-17T13:27:12.0523951Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61900 2022-08-17T13:27:12.0530059Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61901 2022-08-17T13:27:13.4789895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:13.4790596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:13.4792929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:13.4793401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:13.5410992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:13.5411461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:13.5415471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:13.5415940Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:13.6483097Z dist init r=1, world=2 2022-08-17T13:27:13.6487002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:27:13.7147140Z dist init r=0, world=2 2022-08-17T13:27:13.7152095Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:27:13.7153417Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:13.7200274Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:27:15.0843159Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:27:15.0843685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:27:15.1078279Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:15.1078831Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:15.1079795Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:15.1080648Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:15.5270915Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:15.5271758Z warnings.warn( 2022-08-17T13:27:15.5272886Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:15.5273960Z warnings.warn( 2022-08-17T13:27:15.5462084Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5463979Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5465296Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5466565Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5467826Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5469098Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:15.5470338Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5471599Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5639743Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5641330Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5642605Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5643981Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5645227Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5646486Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5647746Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5649010Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5650261Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:15.5651503Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5762494Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.5764036Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:15.6124724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:15.6125609Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:15.6126548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 
2022-08-17T13:27:15.6127214Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:15.8250518Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:15.8251544Z return iter(self.unbind(0)) 2022-08-17T13:27:15.8252684Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:15.8253456Z return iter(self.unbind(0)) 2022-08-17T13:27:16.5639669Z ok (4.514s) 2022-08-17T13:27:16.5658451Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61983 2022-08-17T13:27:16.5664400Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61984 2022-08-17T13:27:18.0003385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:18.0003880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:18.0005740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:18.0006212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:18.0231132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:18.0231612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:18.0235582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:18.0236087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:18.1662920Z dist init r=1, world=2 2022-08-17T13:27:18.1666944Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:27:18.1962400Z dist init r=0, world=2 2022-08-17T13:27:18.1967314Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:27:18.1968264Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:18.1973243Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:27:19.5859926Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:27:19.5860464Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:27:19.6155891Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:19.6156460Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:19.6157165Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:19.6157704Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:20.0245440Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:20.0246607Z warnings.warn( 2022-08-17T13:27:20.0247785Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:20.0248593Z warnings.warn( 2022-08-17T13:27:20.0448872Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0450200Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0451479Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0452745Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0454019Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0455280Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:20.0456511Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0457894Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0632195Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0633508Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0634897Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0636152Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0637399Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0638656Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0639907Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0641161Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0642394Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:20.0643638Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0757645Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.0758938Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:20.1194432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:20.1195121Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:20.1196196Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 
2022-08-17T13:27:20.1196855Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:20.3669079Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:20.3669897Z return iter(self.unbind(0)) 2022-08-17T13:27:20.3671040Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:20.3671822Z return iter(self.unbind(0)) 2022-08-17T13:27:21.0781318Z ok (4.514s) 2022-08-17T13:27:21.0800934Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62066 2022-08-17T13:27:21.0806502Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62067 2022-08-17T13:27:22.5350055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:22.5350576Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:22.5352333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:22.5352808Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:22.5567055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:22.5567527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:22.5571489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:22.5571954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:22.7026458Z dist init r=0, world=2 2022-08-17T13:27:22.7030485Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:27:22.7311360Z dist init r=1, world=2 2022-08-17T13:27:22.7316199Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:27:22.7317382Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:22.7337139Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:27:24.0992083Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:27:24.0992610Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:27:24.1275348Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:24.1275903Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:24.1276831Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:24.1277724Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:24.5346596Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:24.5347416Z warnings.warn( 2022-08-17T13:27:24.5428813Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:24.5429757Z warnings.warn( 2022-08-17T13:27:24.5637580Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5639130Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5640624Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5642137Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5643433Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5645148Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:24.5646657Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5648210Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5828437Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5830155Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5831640Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5833114Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5834542Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5835829Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5837337Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5838811Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5840251Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:24.5841796Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5958064Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.5960609Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:24.6392002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:24.6392907Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:24.6395456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 
2022-08-17T13:27:24.6396281Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:24.9011540Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:24.9012384Z return iter(self.unbind(0)) 2022-08-17T13:27:24.9013505Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:24.9014269Z return iter(self.unbind(0)) 2022-08-17T13:27:25.5922253Z ok (4.514s) 2022-08-17T13:27:25.5940203Z test_transformer_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62149 2022-08-17T13:27:25.5946425Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62150 2022-08-17T13:27:27.1015326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:27.1015938Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:27.1020104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:27.1020954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:27.1268925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:27.1269405Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:27.1272836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:27.1273344Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:27.2786888Z dist init r=0, world=2 2022-08-17T13:27:27.2791957Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:27:27.3010182Z dist init r=1, world=2 2022-08-17T13:27:27.3014708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:27:27.3015725Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:27.3098678Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:27:28.6840688Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:27:28.6841223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:27:28.7236917Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:28.7237850Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:28.7238561Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:28.7239083Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:29.3437479Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:29.3438238Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:29.3679109Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:29.3679914Z warnings.warn( 2022-08-17T13:27:29.3684369Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:29.3685119Z warnings.warn( 2022-08-17T13:27:29.4558027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:29.4558719Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:29.4571535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:29.4572228Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:29.4878769Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:29.4879746Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:29.6263371Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:29.6264493Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:29.6464338Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6465705Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6466976Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6468351Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6469640Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6470881Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6472150Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:29.6473396Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6474650Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6475906Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6477149Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6478395Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6479709Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6480966Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6482244Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6483544Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6484784Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:29.6486034Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:29.8142332Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:29.8142851Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:29.9522372Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:29.9522866Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.0907711Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.0908206Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.1438300Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1439587Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1440859Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1442337Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1443638Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1444884Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1446229Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1447482Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1448734Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:30.1449981Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1451223Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1452458Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1453706Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1454952Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1456199Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1457474Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1458719Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.1459955Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.2336209Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.2336733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.3741260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.3741759Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.5150428Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.5150899Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:27:30.5702773Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5705692Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5708193Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5709773Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5711043Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5712301Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5713548Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5715095Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5716363Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5717620Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5719551Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:30.5720819Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5722081Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5723323Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5724570Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5725835Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5727079Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5728332Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5729641Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.5730907Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.6583373Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.6584566Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.7994500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.7995214Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.9403117Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:30.9403623Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:27:30.9953166Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9954483Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9955760Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9957031Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9958289Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9959572Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9960830Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9962086Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9963502Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9964765Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9966014Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:30.9967366Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9968621Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9969874Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9971132Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9972386Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9973630Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9974887Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9976123Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:30.9977428Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:31.6086375Z ok (6.016s) 2022-08-17T13:27:31.6104993Z test_transformer_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62232 2022-08-17T13:27:31.6111361Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62233 2022-08-17T13:27:33.0828529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:33.0829043Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:33.0831100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:33.0831871Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:33.0981670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:33.0982143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:33.0986260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:33.0986753Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:33.2504232Z dist init r=0, world=2 2022-08-17T13:27:33.2508373Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:27:33.2642965Z dist init r=1, world=2 2022-08-17T13:27:33.2647449Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:27:33.2648532Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:33.2713553Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:27:34.6168881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:27:34.6169471Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:27:34.6514213Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:34.6514780Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:34.6515459Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:34.6515997Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:35.2417393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:35.2417945Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:35.2657683Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:35.2658495Z warnings.warn( 2022-08-17T13:27:35.2662570Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:35.2663991Z warnings.warn( 2022-08-17T13:27:35.3651956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:35.3652638Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:35.3653545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:35.3654335Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:35.3937588Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:35.3938096Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:35.5440903Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:35.5441564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:35.5640707Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5642001Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5643277Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5644859Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5646547Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5647857Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5649098Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:35.5650488Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5651770Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5653027Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5654284Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5655612Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5656854Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5658103Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5659347Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5660574Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5661830Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5663083Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:35.5664622Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.5665961Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:35.7438357Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:35.7438870Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:35.8941060Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:35.8941819Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:36.0449333Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:36.0449813Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:36.1013811Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.1015134Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.1016606Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.1017899Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.1019385Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:775: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:36.1020203Z return super(Tensor, self).split_with_sizes(split_size, dim) 2022-08-17T13:27:36.1021155Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.1022429Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.1023849Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.1025257Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.1026655Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.1969827Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:36.1970333Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:36.3493034Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:36.3493680Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:36.5022272Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:36.5022795Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:27:36.5606562Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.5608790Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.5610120Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.5611385Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.5612641Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.5613926Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.5615164Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.5616418Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.5617869Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.5619140Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:36.6575224Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:36.6575934Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:36.8124170Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:27:36.8124696Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:36.9658465Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:36.9659600Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:37.0244387Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:37.0245693Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:37.0246978Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:37.0248246Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:37.0249510Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:37.0250770Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:37.0252025Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:37.0253439Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:37.0254729Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:37.0255982Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:37.6251292Z ok (6.016s) 2022-08-17T13:27:37.6269436Z test_transformer_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62315 2022-08-17T13:27:37.6275100Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62316 2022-08-17T13:27:39.1066908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:39.1067434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:39.1069773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:39.1070252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:39.1280657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:39.1281115Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:39.1285181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:39.1285655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:39.2791197Z dist init r=0, world=2 2022-08-17T13:27:39.2795562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:27:39.2946903Z dist init r=1, world=2 2022-08-17T13:27:39.2951000Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:27:39.2951934Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:27:39.3000631Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:40.6621037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:27:40.6621573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:27:40.6994299Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:40.6994877Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:40.6995581Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:40.6996103Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:41.3198620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:41.3199161Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:41.3438543Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:41.3439431Z warnings.warn( 2022-08-17T13:27:41.3442506Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:41.3443274Z warnings.warn( 2022-08-17T13:27:41.4447400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:41.4448370Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:41.4452405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:41.4453076Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:41.4744363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:41.4744881Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:41.6275151Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:41.6275685Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:41.6471399Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6473127Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6474445Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6475706Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6477354Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6478742Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6480448Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:41.6481743Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6483022Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6484370Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6485609Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6486857Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6488122Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6489370Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6490634Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6491899Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6493149Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6494459Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:41.6495720Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.6496951Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:41.8287117Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:41.8287640Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:41.9822872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:41.9823394Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:42.1404335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:42.1404880Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:42.1970874Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:775: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 
2022-08-17T13:27:42.1971807Z return super(Tensor, self).split_with_sizes(split_size, dim) 2022-08-17T13:27:42.1972801Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.1974081Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.1975340Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.1976580Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.1977835Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.1979373Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.1980642Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.1981865Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.1983246Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.2949904Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:42.2950423Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:42.4495135Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:42.4495662Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:42.6050852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:42.6051353Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:27:42.6637179Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.6638483Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.6639750Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.6641023Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.6642258Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.6643535Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.6644985Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.6646276Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.6647527Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.6648870Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:42.7629980Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:42.7630495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:42.9176635Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:27:42.9177140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:43.0734813Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:43.0735311Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:43.1322090Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:43.1323396Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:43.1324672Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:43.1325949Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:43.1327199Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:43.1328463Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:43.1329896Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:43.1331181Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:43.1332433Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:43.1333764Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:43.7426183Z ok (6.117s) 2022-08-17T13:27:43.7444585Z test_transformer_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62398 2022-08-17T13:27:43.7450593Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62399 2022-08-17T13:27:45.2394496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:45.2395028Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:45.2396943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:45.2397437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:45.2401088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:45.2401542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:45.2405705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:45.2406208Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:45.4193458Z dist init r=0, world=2 2022-08-17T13:27:45.4193761Z dist init r=1, world=2 2022-08-17T13:27:45.4197175Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:27:45.4197707Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:27:45.4198476Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:27:45.4199448Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:46.8014109Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:27:46.8014628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:27:46.8393442Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:46.8394033Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:46.8395014Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:46.8395577Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:47.4277865Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.4278389Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.4520566Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:47.4521669Z warnings.warn( 2022-08-17T13:27:47.4522789Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:47.4523536Z warnings.warn( 2022-08-17T13:27:47.4645428Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.4646714Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.4898326Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.4898850Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.5248610Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.5249900Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:47.5494013Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.5494494Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.5843859Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.5845142Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6088309Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.6089056Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.6259786Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6261467Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:47.6262891Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6264442Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6265709Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6267214Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6268479Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6269714Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6270967Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6272212Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6273455Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6274793Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6276044Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:47.6277308Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6278617Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6279864Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6281107Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6282353Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6283616Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6284854Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6286108Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6287351Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6288633Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6289892Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:47.6882918Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.6884244Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.7134127Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.7134620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.7483133Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.7484415Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.7728770Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.7729267Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:27:47.8079103Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.8080383Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.8328320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.8328800Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.9590318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:47.9591037Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:47.9594545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:47.9595224Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:47.9640896Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9643391Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9644820Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9646482Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9647740Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9649001Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9650265Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9651514Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9652763Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9654011Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9655255Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9656563Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:47.9657808Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9659049Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9660308Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9661638Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9662881Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9664415Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9665661Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9666905Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9668151Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9669399Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9670627Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:47.9671960Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9673221Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9674473Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9675800Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9677041Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9678278Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9679527Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9680772Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9681997Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:47.9895257Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:47.9895781Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:48.1423163Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:48.1423680Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:48.2952155Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:48.2952676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:27:48.4237793Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4239294Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4240599Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4241860Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4243213Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4244454Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4245711Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4246972Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4248222Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4249491Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4250744Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:48.4251993Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4253289Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4254550Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4255791Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4257091Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4258313Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4259557Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4260814Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4262069Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4263680Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4264967Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:48.4266218Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4267466Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4268811Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4270057Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4271304Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4272621Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4273882Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4275131Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.4492707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:48.4493215Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:48.6088370Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:48.6088908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:48.7618280Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:48.7618783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:48.8877205Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:48.8878547Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8879818Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8881088Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8882500Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8883942Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8885304Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8886774Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8888144Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8889496Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8890889Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8892255Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:48.8893609Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8894980Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8896335Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8897780Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8899152Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8900507Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8901920Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8903540Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8904934Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8906300Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8907661Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:48.8909022Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8910386Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8911748Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8913107Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8914554Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8915932Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.8917276Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:48.9131200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:48.9133753Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:49.0634974Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:49.0635525Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:49.2138495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:49.2139441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:49.3393857Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3395551Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:49.3396914Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3398423Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3400138Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3401630Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3402907Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3404334Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3405616Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3406845Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3408186Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3409433Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3410715Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:49.3411965Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3413214Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3414461Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3415706Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3416948Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3418261Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3419497Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3420740Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3422047Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3423617Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3424899Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:49.3426154Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3427408Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3428655Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3429907Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3431157Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3432391Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:49.3648717Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:49.3651329Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:49.5167910Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:49.5168659Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:50.1605810Z ok (6.418s) 2022-08-17T13:27:50.1624137Z test_transformer_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62481 2022-08-17T13:27:50.1630310Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62482 2022-08-17T13:27:51.6192905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:51.6193688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:51.6195136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:51.6195626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:51.6465162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:51.6465623Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:51.6470109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:51.6470575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled 
tests") 2022-08-17T13:27:51.7923456Z dist init r=1, world=2 2022-08-17T13:27:51.7943268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:27:51.8158470Z dist init r=0, world=2 2022-08-17T13:27:51.8163400Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:27:51.8164488Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:51.8251169Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:53.2034817Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:27:53.2035339Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:27:53.2391844Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:53.2392436Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:53.2393150Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:27:53.2393684Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:27:53.8349788Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:53.8369060Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:27:53.8590426Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:53.8591209Z warnings.warn( 2022-08-17T13:27:53.8616519Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:27:53.8617302Z warnings.warn( 2022-08-17T13:27:53.8745476Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:53.8746774Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:53.8991087Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:53.8994435Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:27:53.9359949Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:53.9361270Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:53.9612375Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:53.9612896Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:53.9977890Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:53.9979182Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0222835Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.0228394Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.0396386Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0397657Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0399202Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0400483Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0401740Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0403091Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:54.0404339Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0405571Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0406844Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0408100Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0409338Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0410592Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0411836Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0413083Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0414382Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0415636Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0416872Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:27:54.0418162Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0419399Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0420651Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0421913Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0423157Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0424562Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.0425817Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.1044499Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.1046074Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.1293136Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.1295232Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.1654065Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.1655337Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.1899415Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.1902921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.2261189Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.2262474Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.2506092Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.2509660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.3134400Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 
2022-08-17T13:27:54.3135180Z return iter(self.unbind(0)) 2022-08-17T13:27:54.3136313Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:27:54.3137068Z return iter(self.unbind(0)) 2022-08-17T13:27:54.4095633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:54.4096306Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:54.4099309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:27:54.4099981Z warnings.warn(msg, FutureWarning) 2022-08-17T13:27:54.4384111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.4389221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.6242094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.6245935Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.8089035Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:27:54.8093643Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:54.8292814Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.8294375Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.8295752Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.8297103Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.8298479Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:54.9938712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:27:54.9942785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:55.1870919Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:55.1874472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:55.3724606Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:55.3728452Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:55.5531627Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:55.5535998Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:55.5735379Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:55.5736668Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:55.5738143Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:55.5739438Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:55.7355823Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:55.7359448Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:55.9174819Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:55.9178660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:56.0983052Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:56.0986860Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:56.2916310Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:56.2917544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:27:56.3123148Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:56.3124464Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:56.3125738Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:56.3126976Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:27:56.9787889Z ok (6.818s) 2022-08-17T13:27:56.9806389Z test_transformer_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62564 2022-08-17T13:27:56.9812187Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62565 2022-08-17T13:27:58.4649622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:58.4650127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:58.4652169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:58.4652640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:58.4762990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:27:58.4763473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:27:58.4768287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:27:58.4768803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:27:58.6319186Z dist init r=0, world=2 2022-08-17T13:27:58.6322867Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:27:58.6494388Z dist init r=1, world=2 2022-08-17T13:27:58.6499309Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:27:58.6500467Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:27:58.6527480Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:00.0219614Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:28:00.0220175Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:28:00.0596451Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:00.0597050Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:00.0597745Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:00.0598293Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:00.6806982Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:00.6824607Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:00.7055034Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:28:00.7055808Z warnings.warn( 2022-08-17T13:28:00.7073725Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:28:00.7074464Z warnings.warn( 2022-08-17T13:28:00.7203032Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.7205000Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.7449782Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:00.7463417Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:00.7827688Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.7829699Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8072901Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:00.8083157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:00.8446909Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8448799Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8692377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:00.8704070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:00.8859944Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:00.8861249Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8862526Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8864050Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8865328Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8866582Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8867834Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8869214Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8870474Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8871730Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8873076Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8874329Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:00.8881642Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8882913Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8884208Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8885462Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8886721Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8887974Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8889352Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8890629Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8891880Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8893194Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.8894441Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:00.8895684Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.9551991Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.9553305Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:00.9803474Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:00.9812930Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:01.0177372Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:01.0178947Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:01.0423389Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:01.0435608Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:01.0800309Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:01.0801840Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:01.1045577Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:01.1056747Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:01.1686979Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:28:01.1688077Z return iter(self.unbind(0)) 2022-08-17T13:28:01.1689305Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:912: UserWarning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:232.) 2022-08-17T13:28:01.1690136Z return iter(self.unbind(0)) 2022-08-17T13:28:01.2666465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:28:01.2667178Z warnings.warn(msg, FutureWarning) 2022-08-17T13:28:01.2676662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T13:28:01.2677341Z warnings.warn(msg, FutureWarning) 2022-08-17T13:28:01.2963591Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:01.2976590Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:01.4857086Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:01.4865147Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:01.6763787Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:01.6772032Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:01.6978977Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:01.6980332Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:01.6981943Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:01.6983596Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:01.6984999Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:01.8653553Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:01.8663194Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:02.0542841Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:28:02.0553224Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:02.2428130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:02.2438347Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:02.4270233Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:02.4278944Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:02.4482828Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:02.4484231Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:02.4485490Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:02.4486747Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:02.6149446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:02.6156660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:02.8016791Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:02.8026951Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:02.9865423Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:02.9876765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:03.1721437Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:03.1730576Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:28:03.1941943Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:03.1943622Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:03.1944933Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:03.1946338Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:03.8971845Z ok (6.918s) 2022-08-17T13:28:03.8972276Z 2022-08-17T13:28:03.8972977Z ---------------------------------------------------------------------- 2022-08-17T13:28:03.8973396Z Ran 59 tests in 329.303s 2022-08-17T13:28:03.8973578Z 2022-08-17T13:28:03.8973700Z OK (skipped=5) 2022-08-17T13:28:03.8973863Z 2022-08-17T13:28:03.8973995Z Generating XML reports... 2022-08-17T13:28:03.9026197Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestHooks-20220817132234.xml 2022-08-17T13:28:03.9031380Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestNoGrad-20220817132234.xml 2022-08-17T13:28:03.9036186Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParamInit-20220817132234.xml 2022-08-17T13:28:03.9090950Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParityWithDDP-20220817132234.xml 2022-08-17T13:28:04.2485951Z Running distributed/fsdp/test_fsdp_mixed_precision ... [2022-08-17 13:28:04.248101] 2022-08-17T13:28:04.2486899Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_mixed_precision.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:28:04.248178] 2022-08-17T13:28:07.3305636Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_mixed_precision 2022-08-17T13:28:07.3329322Z 2022-08-17T13:28:07.3329577Z Running tests... 
2022-08-17T13:28:07.3330008Z ---------------------------------------------------------------------- 2022-08-17T13:28:07.3484178Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62682 2022-08-17T13:28:07.3491642Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62683 2022-08-17T13:28:08.7785408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:08.7785923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:08.7788431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:08.7788896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:08.7956952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:08.7957412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:08.7961929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:08.7962416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:10.4803040Z dist init r=0, world=2 2022-08-17T13:28:10.4807046Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:28:10.5069378Z dist init r=1, world=2 2022-08-17T13:28:10.5074423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:28:10.5075615Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:28:10.5114183Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:11.5655752Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:28:11.5656646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:28:11.5925621Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:11.5926551Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:11.5927249Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:11.5927798Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:12.1818607Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:12.1819991Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:12.1821265Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:12.1822544Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:12.9629888Z ok (5.630s) 2022-08-17T13:28:12.9648424Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62766 2022-08-17T13:28:12.9654083Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62767 2022-08-17T13:28:14.3959337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:14.3959885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:14.3961572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:14.3962369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:14.4284433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:14.4285011Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:14.4288584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:14.4289361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:16.1051938Z dist init r=1, world=2 2022-08-17T13:28:16.1056067Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-08-17T13:28:16.1486749Z dist init r=0, world=2 2022-08-17T13:28:16.1492019Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:28:16.1493460Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:16.1566037Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:17.2022642Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:28:17.2023825Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:28:17.2286907Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:17.2287655Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:17.2288369Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:17.2288924Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:17.8301069Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:17.8302419Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:17.8303990Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:17.8305283Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:18.5793246Z ok (5.616s) 2022-08-17T13:28:18.5810812Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62850 2022-08-17T13:28:18.5816389Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62851 2022-08-17T13:28:20.0367677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:20.0368196Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:20.0370601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:20.0371112Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:20.0547375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:20.0547832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:20.0552376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 
disabled tests 2022-08-17T13:28:20.0552860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:21.7703920Z dist init r=0, world=2 2022-08-17T13:28:21.7708489Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:28:21.7920518Z dist init r=1, world=2 2022-08-17T13:28:21.7925500Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:28:21.7926324Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:21.8015529Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:22.8623573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:28:22.8624099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:28:22.8887246Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:22.8887809Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:22.8888507Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:22.8889063Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:23.5038784Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:23.5040129Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:23.5041433Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:23.5042705Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:24.2960098Z ok (5.717s) 2022-08-17T13:28:24.2979536Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62934 2022-08-17T13:28:24.2986279Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62935 2022-08-17T13:28:25.7292436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:25.7293145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:25.7294859Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:25.7295347Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:25.7604789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:25.7605283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:25.7609446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:25.7610238Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:27.4368069Z dist init r=0, world=2 2022-08-17T13:28:27.4372371Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:28:27.4796623Z dist init r=1, world=2 2022-08-17T13:28:27.4801823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:28:27.4802617Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:27.4882762Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:28:28.5365094Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:28:28.5365630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:28:28.5608042Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:28.5608637Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:28.5609343Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:28.5609893Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:29.1454235Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:29.1455607Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:29.1456910Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:29.1458187Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:29.9120559Z ok (5.616s) 2022-08-17T13:28:29.9138501Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63018 2022-08-17T13:28:29.9144783Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63019 2022-08-17T13:28:31.3944124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:31.3944604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:31.3947940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:31.3948434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:31.4032887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:31.4033336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:31.4037464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:31.4037956Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:33.1360069Z dist init r=1, world=2 2022-08-17T13:28:33.1363853Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:28:33.1543370Z dist init r=0, world=2 
2022-08-17T13:28:33.1548681Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:28:33.1549504Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:33.1569542Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:34.1936247Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:28:34.1936783Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:28:34.2213649Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:34.2214218Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:34.2243389Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:34.2243960Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:34.8661228Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:34.8662594Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:34.8664324Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:34.8665602Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:35.6283308Z ok (5.716s) 2022-08-17T13:28:35.6301553Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63102 2022-08-17T13:28:35.6307794Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63103 2022-08-17T13:28:37.0646660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:37.0648986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:37.0649627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:37.0650114Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:37.0880904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:37.0881439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:37.0885454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T13:28:37.0886110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:38.7718245Z dist init r=1, world=2 2022-08-17T13:28:38.7722504Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:28:38.7836103Z dist init r=0, world=2 2022-08-17T13:28:38.7841417Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:28:38.7842202Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:38.7927398Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:39.8512677Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:28:39.8513260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:28:39.8810485Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:39.8811447Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:39.8812137Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:39.8812684Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:40.5079860Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:40.5081247Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:40.5082526Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:40.5084098Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:41.2443543Z ok (5.616s) 2022-08-17T13:28:41.2461685Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63186 2022-08-17T13:28:41.2467557Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63187 2022-08-17T13:28:42.7321151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:42.7321675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:42.7323287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:42.7324228Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:42.7606922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:42.7607689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:42.7611831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:42.7612585Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:44.4548668Z dist init r=0, world=2 2022-08-17T13:28:44.4552715Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:28:44.4782469Z dist init r=1, world=2 2022-08-17T13:28:44.4788133Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:28:44.4789443Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:44.4860513Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:28:45.5339199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:28:45.5339945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:28:45.5654131Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:45.5654797Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:45.5655515Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:45.5656077Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:46.1964325Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:46.1965693Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:46.1966985Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:46.1968542Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:46.9607333Z ok (5.716s) 2022-08-17T13:28:46.9625781Z test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63270 2022-08-17T13:28:46.9631913Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63271 2022-08-17T13:28:48.3692754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:48.3693580Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:48.3695154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:48.3695640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:48.4214539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:48.4215130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:48.4219441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:48.4220067Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:50.0659373Z dist init r=0, world=2 2022-08-17T13:28:50.0662932Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:28:50.1226044Z dist init r=1, world=2 2022-08-17T13:28:50.1232390Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:28:50.1233309Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:50.1274999Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:51.1946700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:28:51.1947281Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:28:51.2210557Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:51.2211131Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:51.2211833Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:51.2212389Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:51.8236956Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:51.8238385Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:51.8239971Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:51.8241266Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:52.5767252Z ok (5.616s) 2022-08-17T13:28:52.5785093Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63354 2022-08-17T13:28:52.5791328Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63355 2022-08-17T13:28:54.0127535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:54.0128086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:54.0130111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:54.0130599Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:54.0325295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:54.0325767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:54.0329746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T13:28:54.0330230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:55.7280105Z dist init r=0, world=2 2022-08-17T13:28:55.7283993Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:28:55.7333129Z dist init r=1, world=2 2022-08-17T13:28:55.7338050Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:28:55.7339230Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:55.7387621Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:28:56.7926168Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:28:56.7926688Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:28:56.8207097Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:56.8207732Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:56.8208463Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:28:56.8208990Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:28:57.4224488Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:28:57.4225867Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:57.4227397Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:57.4228694Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:28:58.1922402Z ok (5.615s) 2022-08-17T13:28:58.1940174Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63438 2022-08-17T13:28:58.1946743Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63439 2022-08-17T13:28:59.6481199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:59.6483100Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:59.6483879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:59.6484373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:28:59.6595065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:28:59.6595546Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:28:59.6600013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:28:59.6600539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:01.3646500Z dist init r=0, world=2 2022-08-17T13:29:01.3650520Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:29:01.3905982Z dist init r=1, world=2 2022-08-17T13:29:01.3911103Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:29:01.3912126Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:01.3957515Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:29:02.4581344Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:29:02.4581875Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:29:02.4847874Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:02.4848442Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:02.4849160Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:02.4849711Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:03.0823593Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:03.0825323Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:03.0826638Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:29:03.0827897Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:03.9082794Z ok (5.716s) 2022-08-17T13:29:03.9100086Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63522 2022-08-17T13:29:03.9106833Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63523 2022-08-17T13:29:05.3806291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:05.3806804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:05.3808955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:05.3809430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:05.4303951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:05.4304445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:05.4308806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:05.4309278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:07.0629191Z dist init r=0, world=2 2022-08-17T13:29:07.0633021Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:29:07.1415507Z dist init r=1, world=2 
2022-08-17T13:29:07.1420760Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:29:07.1421688Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:07.1447551Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:08.2092106Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:29:08.2092656Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:29:08.2366139Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:08.2366761Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:08.2367470Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:08.2368015Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:08.8258928Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:08.8260556Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:29:08.8261880Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:08.8263147Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:09.6244342Z ok (5.716s) 2022-08-17T13:29:09.6262181Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63606 2022-08-17T13:29:09.6268118Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63607 2022-08-17T13:29:11.1093047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:11.1093652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:11.1096135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:11.1096928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:11.1218867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:11.1219637Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:11.1223434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T13:29:11.1224400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:12.8361863Z dist init r=1, world=2 2022-08-17T13:29:12.8366016Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:29:12.8656871Z dist init r=0, world=2 2022-08-17T13:29:12.8661928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:29:12.8663672Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:12.8672699Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:13.9104271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:29:13.9105005Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:29:13.9367577Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:13.9368384Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:13.9369097Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:13.9369646Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:14.5176387Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:29:14.5177784Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:14.5179058Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:14.5180431Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:15.2401510Z ok (5.616s) 2022-08-17T13:29:15.2418280Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63690 2022-08-17T13:29:15.2424253Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63691 2022-08-17T13:29:16.6925077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:16.6925608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:16.6928130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:16.6928627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:16.7073053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:16.7073521Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:16.7077741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:16.7078240Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:18.4119773Z dist init r=0, world=2 2022-08-17T13:29:18.4124338Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:29:18.4166643Z dist init r=1, world=2 2022-08-17T13:29:18.4171692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:29:18.4172917Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:18.4228352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:29:19.4651358Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:29:19.4651937Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:29:19.4930828Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:19.4931413Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:19.4932101Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:19.4932679Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:20.1573152Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:20.1574531Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:20.1575813Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:29:20.1577203Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:20.9559886Z ok (5.716s) 2022-08-17T13:29:20.9577715Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63774 2022-08-17T13:29:20.9583804Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63775 2022-08-17T13:29:22.4503350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:22.4504106Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:22.4506197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:22.4506690Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:22.4627537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:22.4628010Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:22.4632434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:22.4632924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:24.2003278Z dist init r=1, world=2 2022-08-17T13:29:24.2006359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:29:24.2051051Z dist init r=0, world=2 2022-08-17T13:29:24.2056578Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:29:24.2057736Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:24.2109705Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:25.2690291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:29:25.2690828Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:29:25.2967034Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:25.2967603Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:25.2969150Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:25.2969772Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:25.9237277Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:25.9238601Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:29:25.9240097Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:25.9241353Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:26.6721215Z ok (5.716s) 2022-08-17T13:29:26.6738390Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63858 2022-08-17T13:29:26.6744447Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63859 2022-08-17T13:29:28.1079590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:28.1080079Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:28.1085813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:28.1086533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:28.1324343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:28.1324794Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:28.1328929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T13:29:28.1329420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:29.7899635Z dist init r=0, world=2 2022-08-17T13:29:29.7904033Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:29:29.8326190Z dist init r=1, world=2 2022-08-17T13:29:29.8331056Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:29:29.8332145Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:29.8414038Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:30.8960718Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:29:30.8961253Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:29:30.9210412Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:30.9211294Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:30.9212016Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:30.9212553Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:31.5414173Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:29:31.5415528Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:31.5417081Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:31.5418356Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:32.2880302Z ok (5.616s) 2022-08-17T13:29:32.2898575Z test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63942 2022-08-17T13:29:32.2904708Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63943 2022-08-17T13:29:33.7284024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:33.7284534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:33.7287018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:33.7287518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:33.7655463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:33.7655921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:33.7660122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:33.7660616Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:35.4083530Z dist init r=1, world=2 2022-08-17T13:29:35.4087813Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:29:35.4679667Z dist init r=0, world=2 2022-08-17T13:29:35.4684928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:29:35.4686139Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:35.4698985Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:29:36.5012182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:29:36.5012714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:29:36.5289760Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:36.5290356Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:36.5291065Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:36.5291614Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:37.1416047Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:37.1417774Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:37.1419051Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:29:37.1420324Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:37.9041898Z ok (5.616s) 2022-08-17T13:29:37.9060540Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64026 2022-08-17T13:29:37.9066928Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64027 2022-08-17T13:29:39.3368792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:39.3369585Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:39.3371276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:39.3371741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:39.3769046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:39.3769514Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:39.3773202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:39.3773689Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:41.0291731Z dist init r=1, world=2 2022-08-17T13:29:41.0295405Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:29:41.0476438Z dist init r=0, world=2 
2022-08-17T13:29:41.0481675Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:29:41.0482486Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:41.0499836Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:42.0891449Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:29:42.0892289Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:29:42.1156725Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:42.1157525Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:42.1158221Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:42.1158771Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:42.6265138Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:42.6266804Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:29:42.6268052Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:42.6269330Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:42.7877008Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:42.7879511Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:42.7880821Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:42.7882065Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:43.4202072Z ok (5.516s) 2022-08-17T13:29:43.4220508Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64110 2022-08-17T13:29:43.4226748Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64111 2022-08-17T13:29:44.8558527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:44.8559243Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:44.8560638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:44.8561113Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:44.8787786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:44.8788231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:44.8792580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:44.8793049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:46.5532381Z dist init r=0, world=2 2022-08-17T13:29:46.5536350Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:29:46.5976745Z dist init r=1, world=2 2022-08-17T13:29:46.5982059Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:29:46.5983195Z INFO:torch.distributed.distributed_c10d:Rank 
1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:46.6046537Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:47.6638385Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:29:47.6638920Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:29:47.6879613Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:47.6880219Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:47.6880925Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:47.6881464Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:48.2525978Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:48.2527431Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:48.2528745Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:48.2530010Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:48.4010439Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:48.4011997Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:48.4013296Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:48.4014549Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:29:49.0360790Z ok (5.616s) 2022-08-17T13:29:49.0378935Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64194 2022-08-17T13:29:49.0385044Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64195 2022-08-17T13:29:50.4644646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:50.4647009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:50.4647603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:50.4648103Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:50.4997012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:50.4997485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:50.5001768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:50.5002259Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:52.1483453Z dist init r=0, world=2 2022-08-17T13:29:52.1487050Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:29:52.2081099Z dist init r=1, world=2 2022-08-17T13:29:52.2086187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:29:52.2087316Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:29:52.2098439Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:53.2796465Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:29:53.2796990Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:29:53.3040453Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:53.3041043Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:53.3041750Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:53.3042294Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:53.8393408Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:53.8394808Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:53.8396336Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:53.8398013Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:53.9997466Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:53.9999002Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:54.0000283Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:54.0001542Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:29:54.6519748Z ok (5.616s) 2022-08-17T13:29:54.6538701Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64278 2022-08-17T13:29:54.6544600Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64279 2022-08-17T13:29:56.0490593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:56.0491586Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:56.0492774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:56.0493720Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:56.0814127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:29:56.0815071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:29:56.0820026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:29:56.0821047Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:29:57.7483699Z dist init r=1, world=2 2022-08-17T13:29:57.7487935Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:29:57.8151310Z dist init r=0, world=2 2022-08-17T13:29:57.8157560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:29:57.8159068Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:29:57.8200542Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:29:58.8600949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:29:58.8602053Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:29:58.8839718Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:58.8840871Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:58.8842224Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:29:58.8843182Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:29:59.3965641Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:59.3968231Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:59.3970786Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:59.3973393Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:59.5446898Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:59.5449550Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:59.5451883Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:29:59.5454596Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:30:00.1676300Z ok (5.516s) 2022-08-17T13:30:00.1697078Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64362 2022-08-17T13:30:00.1702574Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64363 2022-08-17T13:30:01.6188337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:01.6188837Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:01.6191422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:01.6191907Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:01.6496699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:01.6497161Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:01.6501222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:01.6501698Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:03.3296946Z dist init r=1, world=2 2022-08-17T13:30:03.3300428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:30:03.3743497Z dist init r=0, world=2 2022-08-17T13:30:03.3749324Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:30:03.3750411Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:30:03.3810368Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:04.4221289Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:30:04.4221809Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:30:04.4479825Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:04.4480389Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:04.4481068Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:04.4481631Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:05.0074725Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:05.0076067Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:05.0077356Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:05.0078896Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:05.2281664Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:05.2283083Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:05.2284729Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:05.2286088Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:30:05.8840368Z ok (5.716s) 2022-08-17T13:30:05.8858238Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64446 2022-08-17T13:30:05.8863962Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64447 2022-08-17T13:30:07.3105073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:07.3105560Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:07.3108009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:07.3108494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:07.3492229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:07.3492683Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:07.3496706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:07.3497189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:09.0078591Z dist init r=1, world=2 2022-08-17T13:30:09.0083117Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:30:09.0623064Z dist init r=0, world=2 2022-08-17T13:30:09.0628404Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:30:09.0629448Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:30:09.0694726Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:10.0969844Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:30:10.0970409Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:30:10.1240660Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:10.1241249Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:10.1241937Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:10.1242477Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:10.6379602Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:10.6381364Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:10.6382725Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:10.6384952Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:10.8096758Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:10.8098083Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:10.8099349Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:10.8100618Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:30:11.5002602Z ok (5.616s) 2022-08-17T13:30:11.5020378Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64530 2022-08-17T13:30:11.5026475Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64531 2022-08-17T13:30:12.9540681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:12.9541251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:12.9543104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:12.9543844Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:12.9709224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:12.9709668Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:12.9713570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:12.9714042Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:14.6771924Z dist init r=0, world=2 2022-08-17T13:30:14.6775870Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:30:14.6856290Z dist init r=1, world=2 2022-08-17T13:30:14.6861096Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:30:14.6862234Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:30:14.6878930Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:15.7399417Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:30:15.7399923Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:30:15.7640130Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:15.7640686Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:15.7641386Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:15.7641930Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:16.3368975Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:16.3370289Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:16.3371588Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:16.3372845Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:16.5465511Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:16.5467001Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:16.5468402Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:16.5469763Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:30:17.2162469Z ok (5.716s) 2022-08-17T13:30:17.2180495Z test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64614 2022-08-17T13:30:17.2186796Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64615 2022-08-17T13:30:18.6580051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:18.6580559Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:18.6582515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:18.6582997Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:18.6760699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:18.6761160Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:18.6765360Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:18.6765834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:20.3669245Z dist init r=0, world=2 2022-08-17T13:30:20.3673009Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:30:20.3925494Z dist init r=1, world=2 2022-08-17T13:30:20.3930230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:30:20.3931319Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:30:20.3979646Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:21.4679851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:30:21.4680377Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:30:21.4920209Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:21.4920756Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:21.4921455Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:21.4921991Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:22.0105282Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:22.0106985Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:22.0108289Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:22.0109552Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:22.1793997Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:22.1795331Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:22.1796589Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:22.1797866Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:30:22.8326775Z ok (5.616s) 2022-08-17T13:30:22.8344004Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64698 2022-08-17T13:30:22.8350121Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64699 2022-08-17T13:30:24.2692974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:24.2693484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:24.2695789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:24.2696266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:24.2950973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:24.2951434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:24.2955485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:24.2955944Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:25.9756534Z dist init r=0, world=2 2022-08-17T13:30:25.9759934Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:30:26.0146354Z dist init r=1, world=2 2022-08-17T13:30:26.0152255Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:30:26.0153069Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:30:26.0167965Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:27.0777482Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:30:27.0778024Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:30:27.1048348Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:27.1049252Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:27.1080548Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:27.1081112Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:27.6941147Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:27.6942507Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:27.6944074Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:27.6945347Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:28.4484189Z ok (5.616s) 2022-08-17T13:30:28.4503015Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64782 2022-08-17T13:30:28.4509333Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64783 2022-08-17T13:30:29.8866806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:29.8867294Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:29.8869898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:29.8870385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:29.9124951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:29.9125402Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:29.9129626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:29.9130130Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:31.6155054Z dist init r=1, world=2 2022-08-17T13:30:31.6159602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-08-17T13:30:31.6308775Z dist init r=0, world=2 2022-08-17T13:30:31.6313579Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:30:31.6314376Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:31.6364121Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:32.6715641Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:30:32.6716172Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:30:32.6967882Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:32.6968462Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:32.6969165Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:32.6969706Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:33.2759347Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:33.2760813Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:30:33.2762111Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:33.2763382Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:34.0646220Z ok (5.616s) 2022-08-17T13:30:34.0665398Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64866 2022-08-17T13:30:34.0671959Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64867 2022-08-17T13:30:35.5403424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:35.5404014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:35.5406051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:35.5406525Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:35.5411047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:35.5411526Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:35.5415710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 
disabled tests 2022-08-17T13:30:35.5416435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:37.2788125Z dist init r=1, world=2 2022-08-17T13:30:37.2793072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:30:37.2876636Z dist init r=0, world=2 2022-08-17T13:30:37.2881383Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:30:37.2882673Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:37.2895609Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:38.3326812Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:30:38.3327595Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:30:38.3609318Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:38.3610153Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:38.3610847Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:38.3611395Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:39.0145478Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:30:39.0146828Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:39.0148110Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:39.0149380Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:39.7810829Z ok (5.716s) 2022-08-17T13:30:39.7829386Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64950 2022-08-17T13:30:39.7835170Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64951 2022-08-17T13:30:41.2342829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:41.2343635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:41.2345099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:41.2345580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:41.2979079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:41.2979909Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:41.2983108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:41.2983586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:42.9398393Z dist init r=1, world=2 2022-08-17T13:30:42.9402513Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:30:43.0054629Z dist init r=0, world=2 2022-08-17T13:30:43.0059549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:30:43.0060779Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:43.0115544Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:30:44.0582387Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:30:44.0582912Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:30:44.0847177Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:44.0847754Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:44.0848450Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:44.0848989Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:44.6946535Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:44.6948055Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:44.6949327Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:30:44.6950592Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:45.3974166Z ok (5.616s) 2022-08-17T13:30:45.3993233Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65034 2022-08-17T13:30:45.3999216Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65035 2022-08-17T13:30:46.8508305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:46.8508803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:46.8511440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:46.8511930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:46.8659263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:46.8659745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:46.8663714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:46.8664193Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:48.5770867Z dist init r=0, world=2 2022-08-17T13:30:48.5774140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:30:48.5814278Z dist init r=1, world=2 
2022-08-17T13:30:48.5819788Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:30:48.5820814Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:48.5877561Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:49.6614993Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:30:49.6615511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:30:49.6929148Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:49.6929725Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:49.6930428Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:49.6930973Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:50.3181059Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:50.3182397Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:30:50.3183926Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:50.3185218Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:51.1135925Z ok (5.716s) 2022-08-17T13:30:51.1154736Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65118 2022-08-17T13:30:51.1160917Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65119 2022-08-17T13:30:52.5459158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:52.5459666Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:52.5462098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:52.5462607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:52.5990758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:52.5991202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:52.5994989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T13:30:52.5995469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:54.2490929Z dist init r=1, world=2 2022-08-17T13:30:54.2495259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:30:54.3115009Z dist init r=0, world=2 2022-08-17T13:30:54.3119803Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:30:54.3120724Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:54.3210986Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:55.3534446Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:30:55.3534965Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:30:55.3810873Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:55.3811447Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:55.3812173Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:30:55.3812703Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:30:55.9866649Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:30:55.9867984Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:55.9869274Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:55.9870542Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:30:56.7315377Z ok (5.618s) 2022-08-17T13:30:56.7332924Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65202 2022-08-17T13:30:56.7338987Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65203 2022-08-17T13:30:58.1888060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:58.1888655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:58.1890729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:58.1891217Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:58.2103170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:30:58.2103862Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:30:58.2108164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:30:58.2108813Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:30:59.8819244Z dist init r=0, world=2 2022-08-17T13:30:59.8823427Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:30:59.9270851Z dist init r=1, world=2 2022-08-17T13:30:59.9276219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:30:59.9277014Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:30:59.9333145Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:31:00.9991471Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:31:00.9992009Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:31:01.0253048Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:01.0254201Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:01.0255561Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:01.0256647Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:01.6684653Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:01.6687344Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:01.6689956Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:31:01.6692455Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:02.4476807Z ok (5.716s) 2022-08-17T13:31:02.4494299Z test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65286 2022-08-17T13:31:02.4500730Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65287 2022-08-17T13:31:03.9146510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:03.9147012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:03.9149484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:03.9149976Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:03.9396996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:03.9397458Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:03.9401824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:03.9402323Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:05.6125142Z dist init r=1, world=2 2022-08-17T13:31:05.6129448Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:31:05.6285471Z dist init r=0, world=2 2022-08-17T13:31:05.6290577Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:31:05.6291416Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:05.6334090Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:06.6616720Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:31:06.6617267Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:31:06.6890198Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:06.6890831Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:06.6891537Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:06.6892080Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:07.3160123Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:07.3161480Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:31:07.3162764Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:07.3164041Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:08.0637387Z ok (5.616s) 2022-08-17T13:31:08.0655638Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65370 2022-08-17T13:31:08.0661050Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65371 2022-08-17T13:31:09.5692291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:09.5692778Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:09.5695019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:09.5695510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:09.6217508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:09.6217979Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:09.6222030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 
disabled tests 2022-08-17T13:31:09.6222519Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:11.3065264Z dist init r=1, world=2 2022-08-17T13:31:11.3069857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:31:11.3520826Z dist init r=0, world=2 2022-08-17T13:31:11.3525879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:31:11.3526704Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:11.3579905Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:12.4163900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:31:12.4164434Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:31:12.4397506Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:12.4398075Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:12.4398777Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:12.4399320Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:12.9482405Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:31:12.9483752Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:12.9485036Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:12.9486579Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:13.0983963Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:13.0985633Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:13.0987211Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:13.0988479Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:13.7798434Z ok (5.716s) 2022-08-17T13:31:13.7816195Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65454 2022-08-17T13:31:13.7821873Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65455 2022-08-17T13:31:15.2964980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:15.2965498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:15.2968185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:15.2968662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:15.3005127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:15.3005570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:15.3009647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:15.3010122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:17.0759942Z dist init r=1, world=2 
2022-08-17T13:31:17.0764319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:31:17.0782795Z dist init r=0, world=2 2022-08-17T13:31:17.0788166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:31:17.0788944Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:17.0869550Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:18.1368812Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:31:18.1369338Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:31:18.1679451Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:18.1680035Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:18.1681052Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:18.1681668Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:18.6842888Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:18.6844232Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:18.6845821Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:18.6847071Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:18.8440743Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:18.8442086Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:18.8443344Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:31:18.8444606Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:19.4961358Z ok (5.716s) 2022-08-17T13:31:19.4979857Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65538 2022-08-17T13:31:19.4986165Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65539 2022-08-17T13:31:20.9326514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:20.9327219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:20.9328688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:20.9329186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:20.9658881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:20.9659339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:20.9663815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:20.9664294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:22.6298505Z dist init r=0, world=2 2022-08-17T13:31:22.6302568Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:31:22.6918610Z dist init r=1, world=2 
2022-08-17T13:31:22.6923757Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:31:22.6924551Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:22.7016408Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:23.7577346Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:31:23.7577866Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:31:23.7837179Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:23.7837744Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:23.7838426Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:23.7838971Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:24.3143785Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:24.3145161Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:31:24.3146453Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:24.3147731Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:24.4689396Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:24.4690736Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:24.4692322Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:24.4693622Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:25.1121359Z ok (5.616s) 2022-08-17T13:31:25.1138811Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65622 2022-08-17T13:31:25.1144936Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65623 2022-08-17T13:31:26.5829163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:26.5829689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:26.5832101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:26.5832578Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:26.5852386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:26.5852842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:26.5857490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:26.5857976Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:28.3405712Z dist init r=0, world=2 2022-08-17T13:31:28.3411178Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:31:28.3456252Z dist init r=1, world=2 2022-08-17T13:31:28.3461002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:31:28.3461783Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:28.3513818Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:29.4084869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:31:29.4085397Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:31:29.4361821Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:29.4362399Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:29.4363106Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:29.4363648Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:29.9662793Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:29.9664630Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:29.9665919Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:29.9667185Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:30.1168825Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:30.1170158Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:30.1171421Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:30.1172694Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:31:30.7279406Z ok (5.616s) 2022-08-17T13:31:30.7296129Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp32_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65706 2022-08-17T13:31:30.7302187Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65707 2022-08-17T13:31:32.1996445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:32.1996952Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:32.1999027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:32.1999521Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:32.2436284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:32.2436758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:32.2440307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:32.2440775Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:33.8955378Z dist init r=1, world=2 2022-08-17T13:31:33.8958966Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:31:33.9638563Z dist init r=0, world=2 2022-08-17T13:31:33.9643526Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:31:33.9644582Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:31:33.9672442Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:35.0067725Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:31:35.0068220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:31:35.0320340Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:35.0321091Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:35.0321802Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:35.0322642Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:35.5869820Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:35.5871196Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:35.5872463Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:35.5873734Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:35.7958441Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:35.7959737Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:35.7961020Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:35.7962277Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:31:36.4438064Z ok (5.716s) 2022-08-17T13:31:36.4456708Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp32_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65790 2022-08-17T13:31:36.4463110Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65791 2022-08-17T13:31:37.8984860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:37.8985865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:37.8987055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:37.8988009Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:37.9066774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:37.9067745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:37.9072894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:37.9073903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:39.6332608Z dist init r=0, world=2 2022-08-17T13:31:39.6336425Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:31:39.6390745Z dist init r=1, world=2 2022-08-17T13:31:39.6396257Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:31:39.6397664Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:31:39.6440402Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:40.6984687Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:31:40.6999565Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:31:40.7057111Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:40.7058231Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:40.7239693Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:40.7241008Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:41.2446051Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:41.2448698Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:41.2451270Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:41.2453833Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:41.4154053Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:41.4156594Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:41.4159113Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:41.4161844Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:31:42.0605198Z ok (5.617s) 2022-08-17T13:31:42.0622972Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp64_enable_sharded_grad_scaler (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65874 2022-08-17T13:31:42.0628734Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65875 2022-08-17T13:31:43.5505057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:43.5505592Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:43.5507467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:43.5507962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:43.5714988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:43.5715465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:43.5719038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:43.5719520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:45.2975316Z dist init r=0, world=2 2022-08-17T13:31:45.2979084Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:31:45.3160439Z dist init r=1, world=2 2022-08-17T13:31:45.3165469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:31:45.3166277Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:31:45.3183791Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:46.3948306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:31:46.3948829Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:31:46.4200431Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:46.4201012Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:46.4201740Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:46.4202565Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:46.9860869Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:46.9862201Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:46.9864075Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:46.9865352Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:47.1824028Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:47.1825667Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:47.1826943Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:47.1828202Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:31:47.7766598Z ok (5.716s) 2022-08-17T13:31:47.7784938Z test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp64_none (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65958 2022-08-17T13:31:47.7791211Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65959 2022-08-17T13:31:49.2402698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:49.2403202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:49.2405806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:49.2406308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:49.2415436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:49.2415914Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:49.2420241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:49.2420762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:50.9720362Z dist init r=1, world=2 2022-08-17T13:31:50.9724462Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:31:50.9939653Z dist init r=0, world=2 2022-08-17T13:31:50.9945110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:31:50.9946047Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:31:51.0030769Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:52.0389326Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:31:52.0389888Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:31:52.0642083Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:52.0642667Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:52.0643606Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:52.0644149Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:52.6140352Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:52.6141989Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:52.6143952Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:52.6145816Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:52.7856732Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:52.7858303Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:52.7859780Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:31:52.7861192Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:31:53.3924547Z ok (5.616s) 2022-08-17T13:31:53.3942375Z test_mixed_precision_no_reshard_after_forward (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66042 2022-08-17T13:31:53.3948613Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66043 2022-08-17T13:31:54.8284808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:54.8285290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:54.8287370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:54.8287856Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:54.8898302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:31:54.8898751Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:31:54.8902915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:31:54.8903402Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:31:56.5293575Z dist init r=0, world=2 2022-08-17T13:31:56.5297025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:31:56.6035549Z dist init r=1, world=2 2022-08-17T13:31:56.6041181Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:31:56.6042158Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:31:56.6112427Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:31:57.6807672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:31:57.6808231Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:31:57.7100774Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:57.7101372Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:57.7102065Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:31:57.7102604Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:31:58.7081652Z ok (5.316s) 2022-08-17T13:31:58.7093598Z test_mixed_precision_resnet (__main__.TestFSDPMixedPrecisionSharded) 2022-08-17T13:31:58.7094368Z End to end test to ensure mixed precision + auto_wrap works ... skip: no torchvision (0.001s) 2022-08-17T13:31:58.7124288Z test_mp_batchnorm_convert_sync_bn_False (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66126 2022-08-17T13:31:58.7130154Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66127 2022-08-17T13:32:00.1869610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:00.1870173Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:00.1873537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:00.1874404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:00.1926814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:00.1927280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:00.1931107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:00.1931600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:01.9311848Z dist init r=0, world=2 2022-08-17T13:32:01.9316141Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:32:01.9517293Z dist init r=1, world=2 2022-08-17T13:32:01.9522493Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:32:01.9523863Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:01.9622991Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:32:03.0339879Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:32:03.0340394Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:32:03.0640320Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:03.0640908Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:03.0641626Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:03.0642147Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:04.3266965Z ok (5.617s) 2022-08-17T13:32:04.3299106Z test_mp_batchnorm_convert_sync_bn_True (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66210 2022-08-17T13:32:04.3305163Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66211 2022-08-17T13:32:05.7885106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:05.7885616Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:05.7888291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:05.7888782Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:05.7964584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:05.7965049Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:05.7969476Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:05.7969957Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:07.5089433Z dist init r=1, world=2 2022-08-17T13:32:07.5093548Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:32:07.5232474Z dist init r=0, world=2 2022-08-17T13:32:07.5238683Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:32:07.5239546Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:07.5299220Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:08.5839412Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:32:08.5839925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:32:08.6125188Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:08.6125765Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:08.6159798Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:08.6160654Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:09.6433547Z ok (5.317s) 2022-08-17T13:32:09.6449680Z test_mp_embedding_default (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66294 2022-08-17T13:32:09.6455396Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66295 2022-08-17T13:32:11.1368006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:11.1368506Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:11.1371091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:11.1371577Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:11.1728538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:11.1729009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:11.1732838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:11.1733314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:12.8791188Z dist init r=0, world=2 2022-08-17T13:32:12.8795528Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:32:12.8855573Z dist init r=1, world=2 2022-08-17T13:32:12.8860842Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:32:12.8861986Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:12.8899117Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:32:13.9245454Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:32:13.9246383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:32:13.9611470Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:13.9612028Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:13.9612734Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:13.9613281Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:15.0587750Z ok (5.415s) 2022-08-17T13:32:15.0604576Z test_mp_embedding_only_params_and_bufs (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66378 2022-08-17T13:32:15.0610727Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66379 2022-08-17T13:32:16.4842097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:16.4842627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:16.4844872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:16.4845357Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:16.5281700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:16.5282164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:16.5286053Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:16.5286696Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:18.1647802Z dist init r=0, world=2 2022-08-17T13:32:18.1652098Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:32:18.1956830Z dist init r=1, world=2 2022-08-17T13:32:18.1961758Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:32:18.1962783Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:18.2060831Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:19.2306391Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:32:19.2306933Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:32:19.2689551Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:19.2690355Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:19.2691038Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:19.2691579Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:20.3741081Z ok (5.315s) 2022-08-17T13:32:20.3757767Z test_mp_embedding_params_and_reduce_diff (__main__.TestFSDPMixedPrecisionSharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66462 2022-08-17T13:32:20.3763807Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66463 2022-08-17T13:32:21.8448919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:21.8449419Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:21.8451656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:21.8452121Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:21.8708563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:21.8709023Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:21.8713156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:21.8713621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:23.5434704Z dist init r=0, world=2 2022-08-17T13:32:23.5438657Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:32:23.5804756Z dist init r=1, world=2 2022-08-17T13:32:23.5810204Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:32:23.5811032Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:23.5846886Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:32:24.6351307Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:32:24.6351816Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:32:24.6728253Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:24.6729135Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:24.6729840Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:24.6730362Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:25.6894417Z ok (5.315s) 2022-08-17T13:32:25.6911613Z test_mp_embedding_reduce (__main__.TestFSDPMixedPrecisionSharded) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66546 2022-08-17T13:32:25.6917679Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66547 2022-08-17T13:32:27.1729972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:27.1730476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:27.1732031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:27.1732520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:27.2022848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:27.2023317Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:27.2027642Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:27.2028123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:28.8746651Z dist init r=0, world=2 2022-08-17T13:32:28.8751321Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:32:28.9045168Z dist init r=1, world=2 2022-08-17T13:32:28.9050336Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:32:28.9051234Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:28.9057599Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:29.9819810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:32:29.9820337Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:32:30.0199178Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:30.0199735Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:30.0202808Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:30.0203384Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:31.4055395Z ok (5.716s) 2022-08-17T13:32:31.4074358Z test_mixed_precision_e2e_full_shard (__main__.TestFSDPMixedPrecisionUnsharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66630 2022-08-17T13:32:32.8570971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:32.8571474Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:32.8573917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:32.8574418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:34.5342854Z dist init r=0, world=1 2022-08-17T13:32:34.5346611Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:32:34.5347762Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:32:34.5821389Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:32:34.5893419Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:34.5894012Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:35.4166907Z ok (4.011s) 2022-08-17T13:32:35.4184366Z test_mixed_precision_no_reshard_after_forward (__main__.TestFSDPMixedPrecisionUnsharded) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66672 2022-08-17T13:32:36.8709894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:36.8710408Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:36.8712862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:36.8713366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:38.5565450Z dist init r=0, world=1 2022-08-17T13:32:38.5569522Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:32:38.5570605Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:32:38.6035317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:32:38.6109983Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:38.6110555Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:39.5278823Z ok (4.111s) 2022-08-17T13:32:39.5279117Z 2022-08-17T13:32:39.5279776Z ---------------------------------------------------------------------- 2022-08-17T13:32:39.5280350Z Ran 50 tests in 272.195s 2022-08-17T13:32:39.5280942Z 2022-08-17T13:32:39.5281161Z OK (skipped=1) 2022-08-17T13:32:39.5281399Z 2022-08-17T13:32:39.5281517Z Generating XML reports... 
2022-08-17T13:32:39.5370421Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_mixed_precision/TEST-TestFSDPMixedPrecisionSharded-20220817132807.xml 2022-08-17T13:32:39.5374353Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_mixed_precision/TEST-TestFSDPMixedPrecisionUnsharded-20220817132807.xml 2022-08-17T13:32:39.8788426Z Running distributed/fsdp/test_fsdp_summon_full_params ... [2022-08-17 13:32:39.878380] 2022-08-17T13:32:39.8789209Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_summon_full_params.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:32:39.878455] 2022-08-17T13:32:41.5086328Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params 2022-08-17T13:32:41.5111541Z 2022-08-17T13:32:41.5111831Z Running tests... 2022-08-17T13:32:41.5112538Z ---------------------------------------------------------------------- 2022-08-17T13:32:43.0132532Z test_cannot_summon_full_params_from_backward (__main__.TestSummonFullParams) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:32:43.0317839Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66749 2022-08-17T13:32:43.0323662Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66750 2022-08-17T13:32:44.4916387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:44.4916882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:44.4919426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:44.4919935Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:44.5302320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:44.5302791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:44.5307077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:44.5307570Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:44.6642490Z dist init r=0, world=2 2022-08-17T13:32:44.6646376Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:32:44.6996256Z dist init r=1, world=2 2022-08-17T13:32:44.7000732Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:32:44.7001452Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:44.7055021Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:32:46.0717529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:32:46.0718077Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:32:46.0927369Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:32:46.0928256Z warnings.warn( 2022-08-17T13:32:46.0946357Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:46.0946920Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:46.0961033Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:32:46.0961800Z warnings.warn( 2022-08-17T13:32:46.0979332Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:46.0979902Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:46.5102127Z Asserting FSDP instance is: FullyShardedDataParallel( 2022-08-17T13:32:46.5102587Z (_fsdp_wrapped_module): FlattenParamsWrapper( 2022-08-17T13:32:46.5102960Z (_fpw_module): Linear(in_features=2, out_features=1, bias=True) 2022-08-17T13:32:46.5103253Z ) 2022-08-17T13:32:46.5103706Z ) 2022-08-17T13:32:46.5104084Z ERROR: expected to be in states [] but current state is TrainingState_.BACKWARD_PRE 2022-08-17T13:32:46.5104713Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_summon_full_params.py", line 222, in bad_backwards_hook 2022-08-17T13:32:46.5105304Z with model.summon_full_params(model): 2022-08-17T13:32:46.5105649Z File "/opt/conda/lib/python3.10/contextlib.py", line 135, in __enter__ 2022-08-17T13:32:46.5106128Z return next(self.gen) 2022-08-17T13:32:46.5106811Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 2685, in summon_full_params 2022-08-17T13:32:46.5107209Z stack.enter_context( 2022-08-17T13:32:46.5107558Z File "/opt/conda/lib/python3.10/contextlib.py", line 492, in enter_context 2022-08-17T13:32:46.5108118Z result = _cm_type.__enter__(cm) 2022-08-17T13:32:46.5108452Z File "/opt/conda/lib/python3.10/contextlib.py", line 135, in __enter__ 2022-08-17T13:32:46.5108791Z return next(self.gen) 2022-08-17T13:32:46.5109357Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 2534, in _summon_full_params 2022-08-17T13:32:46.5109767Z stack.enter_context( 2022-08-17T13:32:46.5110088Z File "/opt/conda/lib/python3.10/contextlib.py", line 492, in enter_context 
2022-08-17T13:32:46.5110425Z result = _cm_type.__enter__(cm) 2022-08-17T13:32:46.5110768Z File "/opt/conda/lib/python3.10/contextlib.py", line 135, in __enter__ 2022-08-17T13:32:46.5111080Z return next(self.gen) 2022-08-17T13:32:46.5111642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 2549, in _summon_full_params 2022-08-17T13:32:46.5112083Z self._assert_state([TrainingState_.IDLE]) 2022-08-17T13:32:46.5112655Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 3567, in _assert_state 2022-08-17T13:32:46.5113051Z traceback.print_stack() 2022-08-17T13:32:46.9422759Z ok (5.431s) 2022-08-17T13:32:46.9442339Z test_cannot_summon_full_params_from_forward (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66832 2022-08-17T13:32:46.9448332Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66833 2022-08-17T13:32:48.3632426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:48.3632947Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:48.3635444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:48.3635939Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:48.4103295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:48.4103754Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:48.4108068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:48.4108556Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:48.5287317Z 
dist init r=0, world=2 2022-08-17T13:32:48.5290978Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:32:48.5752884Z dist init r=1, world=2 2022-08-17T13:32:48.5757455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:32:48.5758470Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:48.5801159Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:49.9214682Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:32:49.9215643Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:32:49.9230691Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:32:49.9232636Z warnings.warn( 2022-08-17T13:32:49.9234404Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:32:49.9235146Z warnings.warn( 2022-08-17T13:32:49.9451098Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:49.9451803Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:49.9453084Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:32:49.9453954Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:32:49.9467720Z Asserting FSDP instance is: FullyShardedDataParallel( 2022-08-17T13:32:49.9468482Z (_fsdp_wrapped_module): FlattenParamsWrapper( 2022-08-17T13:32:49.9469080Z (_fpw_module): MyModule() 2022-08-17T13:32:49.9469544Z ) 2022-08-17T13:32:49.9469908Z ) 2022-08-17T13:32:49.9470609Z ERROR: expected to be in states [] but current state is TrainingState_.FORWARD 2022-08-17T13:32:49.9481498Z File "<string>", line 1, in <module> 2022-08-17T13:32:49.9482062Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-08-17T13:32:49.9482763Z exitcode = _main(fd, parent_sentinel) 2022-08-17T13:32:49.9483467Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-08-17T13:32:49.9484203Z return self._bootstrap(parent_sentinel) 2022-08-17T13:32:49.9484883Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-08-17T13:32:49.9485510Z self.run() 2022-08-17T13:32:49.9486116Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-08-17T13:32:49.9486761Z self._target(*self._args, **self._kwargs) 2022-08-17T13:32:49.9487776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 785, in _run 2022-08-17T13:32:49.9488633Z self.run_test(test_name, pipe) 2022-08-17T13:32:49.9489798Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 622, in run_test 2022-08-17T13:32:49.9490629Z getattr(self, test_name)() 2022-08-17T13:32:49.9491803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 503, in wrapper 2022-08-17T13:32:49.9492609Z fn() 2022-08-17T13:32:49.9493676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 145, in wrapper 2022-08-17T13:32:49.9494786Z return func(*args, **kwargs) 2022-08-17T13:32:49.9495769Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_summon_full_params.py", line 213, in test_cannot_summon_full_params_from_forward 2022-08-17T13:32:49.9496652Z model(model) 2022-08-17T13:32:49.9497637Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl 2022-08-17T13:32:49.9498460Z return forward_call(*input, **kwargs) 2022-08-17T13:32:49.9499648Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 2441, in forward 2022-08-17T13:32:49.9500630Z outputs = self._fsdp_wrapped_module(*args, **kwargs) 2022-08-17T13:32:49.9501750Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl 2022-08-17T13:32:49.9502713Z return forward_call(*input, **kwargs) 2022-08-17T13:32:49.9504266Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flatten_params_wrapper.py", line 156, in forward 2022-08-17T13:32:49.9505155Z return self.module(*inputs, **kwinputs) 2022-08-17T13:32:49.9506255Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl 2022-08-17T13:32:49.9507060Z return forward_call(*input, **kwargs) 2022-08-17T13:32:49.9507936Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_summon_full_params.py", line 206, in forward 2022-08-17T13:32:49.9508836Z with 
fsdp_module.summon_full_params(fsdp_module): 2022-08-17T13:32:49.9509631Z File "/opt/conda/lib/python3.10/contextlib.py", line 135, in __enter__ 2022-08-17T13:32:49.9510312Z return next(self.gen) 2022-08-17T13:32:49.9511512Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 2685, in summon_full_params 2022-08-17T13:32:49.9512409Z stack.enter_context( 2022-08-17T13:32:49.9513109Z File "/opt/conda/lib/python3.10/contextlib.py", line 492, in enter_context 2022-08-17T13:32:49.9513816Z result = _cm_type.__enter__(cm) 2022-08-17T13:32:49.9514525Z File "/opt/conda/lib/python3.10/contextlib.py", line 135, in __enter__ 2022-08-17T13:32:49.9515193Z return next(self.gen) 2022-08-17T13:32:49.9516411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 2534, in _summon_full_params 2022-08-17T13:32:49.9517273Z stack.enter_context( 2022-08-17T13:32:49.9517975Z File "/opt/conda/lib/python3.10/contextlib.py", line 492, in enter_context 2022-08-17T13:32:49.9518679Z result = _cm_type.__enter__(cm) 2022-08-17T13:32:49.9519379Z File "/opt/conda/lib/python3.10/contextlib.py", line 135, in __enter__ 2022-08-17T13:32:49.9520055Z return next(self.gen) 2022-08-17T13:32:49.9521271Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 2549, in _summon_full_params 2022-08-17T13:32:49.9522221Z self._assert_state([TrainingState_.IDLE]) 2022-08-17T13:32:49.9523492Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 3567, in _assert_state 2022-08-17T13:32:49.9524641Z traceback.print_stack() 2022-08-17T13:32:50.3544447Z ok (3.412s) 2022-08-17T13:32:50.3557784Z test_named_parameters_buffers_prefix__recurse_False (__main__.TestSummonFullParams) 2022-08-17T13:32:50.3571120Z Tests that ``named_parameters()`` and ``named_buffers()`` for a ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66911 2022-08-17T13:32:50.3577299Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66912 2022-08-17T13:32:51.7833637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:51.7834160Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:51.7836266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:51.7837202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:51.7997883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:51.7998606Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:51.8002388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:51.8003129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:51.9499134Z dist init r=1, world=2 2022-08-17T13:32:51.9503059Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:32:51.9725369Z dist init r=0, world=2 2022-08-17T13:32:51.9730304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:32:51.9731107Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:51.9809083Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:32:53.3687779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:32:53.3688726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:32:53.7665206Z ok (3.412s) 2022-08-17T13:32:53.7677111Z test_named_parameters_buffers_prefix__recurse_True (__main__.TestSummonFullParams) 2022-08-17T13:32:53.7690336Z Tests that ``named_parameters()`` and ``named_buffers()`` for a ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66990 2022-08-17T13:32:53.7696214Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66991 2022-08-17T13:32:55.2791731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:55.2792235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:55.2794683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:55.2795411Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:55.2823278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:55.2824028Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:55.2828241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:55.2828982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:55.4550243Z dist init r=1, world=2 2022-08-17T13:32:55.4554102Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:32:55.4641465Z dist init r=0, world=2 2022-08-17T13:32:55.4645466Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 0 2022-08-17T13:32:55.4646313Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:55.4657215Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:56.8404607Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:32:56.8405126Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:32:57.2783173Z ok (3.512s) 2022-08-17T13:32:57.2794331Z test_named_parameters_buffers_prefix_test_prefix_recurse_False (__main__.TestSummonFullParams) 2022-08-17T13:32:57.2807020Z Tests that ``named_parameters()`` and ``named_buffers()`` for a ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67069 2022-08-17T13:32:57.2812983Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67070 2022-08-17T13:32:58.7026728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:58.7027235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:58.7029875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:58.7030350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:32:58.7759687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:32:58.7760355Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:32:58.7763401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:32:58.7763886Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-08-17T13:32:58.8732621Z dist init r=0, world=2 2022-08-17T13:32:58.8737269Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:32:58.9482753Z dist init r=1, world=2 2022-08-17T13:32:58.9487571Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:32:58.9488285Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:32:58.9552214Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:00.3179835Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:00.3180799Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:00.7898578Z ok (3.511s) 2022-08-17T13:33:00.7909188Z test_named_parameters_buffers_prefix_test_prefix_recurse_True (__main__.TestSummonFullParams) 2022-08-17T13:33:00.7922075Z Tests that ``named_parameters()`` and ``named_buffers()`` for a ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67148 2022-08-17T13:33:00.7928170Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67149 2022-08-17T13:33:02.2242825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:02.2243341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:02.2245049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:02.2245564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:02.2503351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:02.2503807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:02.2508224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:02.2508713Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:02.3911023Z dist init r=1, world=2 2022-08-17T13:33:02.3914754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:02.4229409Z dist init r=0, world=2 2022-08-17T13:33:02.4234185Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:02.4235273Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:02.4323704Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:33:03.8037022Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:03.8037600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:04.3016732Z ok (3.512s) 2022-08-17T13:33:04.3044799Z test_params_are_unflattenned_rank0_only_False_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67227 2022-08-17T13:33:04.3050771Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67228 2022-08-17T13:33:05.7570102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:05.7570837Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:05.7572974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:05.7573460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:05.7813329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:05.7813773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:05.7817996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:05.7818470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:05.9269283Z dist init r=1, world=2 2022-08-17T13:33:05.9273122Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:05.9542388Z dist init r=0, world=2 2022-08-17T13:33:05.9547394Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:05.9548167Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:05.9579552Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:07.3431646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:07.3432185Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:07.8138764Z ok (3.512s) 2022-08-17T13:33:07.8168800Z test_params_are_unflattenned_rank0_only_False_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67306 2022-08-17T13:33:07.8174908Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67307 2022-08-17T13:33:09.2221175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:09.2222156Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:09.2223633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:09.2224572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:09.2263623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:09.2264796Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:09.2269956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:09.2270961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:09.3877440Z dist init r=1, world=2 2022-08-17T13:33:09.3881585Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:09.4032743Z dist init r=0, world=2 2022-08-17T13:33:09.4038903Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:09.4040204Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:09.4085843Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:10.8298078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:10.8298645Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:11.3260578Z ok (3.512s) 2022-08-17T13:33:11.3290826Z test_params_are_unflattenned_rank0_only_False_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67385 2022-08-17T13:33:11.3296837Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67386 2022-08-17T13:33:12.7678426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:12.7678968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:12.7681266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:12.7681747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:12.8023132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:12.8023598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:12.8028648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:12.8029137Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:12.9357743Z dist init r=1, world=2 2022-08-17T13:33:12.9361412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:12.9750365Z dist init r=0, world=2 2022-08-17T13:33:12.9755138Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:12.9756024Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:12.9768578Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:33:14.3349386Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:14.3349925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:14.3585546Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:2505: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-08-17T13:33:14.3586252Z warnings.warn( 2022-08-17T13:33:14.3619469Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:2505: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-08-17T13:33:14.3620142Z warnings.warn( 2022-08-17T13:33:14.7384408Z ok (3.412s) 2022-08-17T13:33:14.7413788Z test_params_are_unflattenned_rank0_only_False_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67464 2022-08-17T13:33:14.7419325Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67465 2022-08-17T13:33:16.1773482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:16.1773995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:16.1776425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:16.1776907Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:16.1996096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:16.1996832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:16.2000792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:16.2001268Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:16.3456579Z dist init r=0, world=2 2022-08-17T13:33:16.3460254Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:16.3735640Z dist init r=1, world=2 2022-08-17T13:33:16.3740299Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:16.3741349Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:16.3766862Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:33:17.7482914Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:17.7483459Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:17.7705925Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:2505: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-08-17T13:33:17.7706644Z warnings.warn( 2022-08-17T13:33:17.7707628Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:2505: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-08-17T13:33:17.7708308Z warnings.warn( 2022-08-17T13:33:18.1505151Z ok (3.412s) 2022-08-17T13:33:18.1532335Z test_params_are_unflattenned_rank0_only_True_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67543 2022-08-17T13:33:18.1537941Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67544 2022-08-17T13:33:19.6051808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:19.6052348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:19.6055262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:19.6055748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:19.6265092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:19.6265568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:19.6269997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:19.6270502Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:19.7813329Z dist init r=0, world=2 2022-08-17T13:33:19.7817791Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:19.7920952Z dist init r=1, world=2 2022-08-17T13:33:19.7925235Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:19.7926322Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:19.8023354Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:33:21.1667384Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:21.1667952Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:21.6627070Z ok (3.512s) 2022-08-17T13:33:21.6654206Z test_params_are_unflattenned_rank0_only_True_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67622 2022-08-17T13:33:21.6660082Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67623 2022-08-17T13:33:23.1274007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:23.1274499Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:23.1277036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:23.1277538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:23.1361671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:23.1362113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:23.1366305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:23.1366780Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:23.2965032Z dist init r=0, world=2 2022-08-17T13:33:23.2969312Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:23.3076105Z dist init r=1, world=2 2022-08-17T13:33:23.3080888Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:23.3081814Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:23.3174804Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:24.6840551Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:24.6841076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:25.0745417Z ok (3.412s) 2022-08-17T13:33:25.0772482Z test_params_are_unflattenned_rank0_only_True_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67701 2022-08-17T13:33:25.0778345Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67702 2022-08-17T13:33:26.4830232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:26.4830735Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:26.4833974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:26.4834474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:26.5262965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:26.5263409Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:26.5267765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:26.5268238Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:26.6528893Z dist init r=0, world=2 2022-08-17T13:33:26.6532241Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:26.7002344Z dist init r=1, world=2 2022-08-17T13:33:26.7006793Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:26.7008007Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:26.7042276Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:28.0922271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:28.0922791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:28.4861621Z ok (3.412s) 2022-08-17T13:33:28.4895566Z test_params_are_unflattenned_rank0_only_True_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67780 2022-08-17T13:33:28.4901551Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67781 2022-08-17T13:33:29.9601702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:29.9602186Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:29.9604747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:29.9605226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:29.9964825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:29.9965294Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:29.9969425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: 
UserWarning: loaded 238 disabled tests 2022-08-17T13:33:29.9969909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:30.1272229Z dist init r=0, world=2 2022-08-17T13:33:30.1275733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:30.1617070Z dist init r=1, world=2 2022-08-17T13:33:30.1621773Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:30.1622614Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:30.1684232Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:31.5085010Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:31.5085544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:31.9991780Z ok (3.513s) 2022-08-17T13:33:32.0018774Z test_params_count_and_value_rank0_only_False_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67859 2022-08-17T13:33:32.0024715Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67860 2022-08-17T13:33:33.4290626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:33.4291123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:33.4293553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:33.4294310Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:33.4388605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:33.4389537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:33.4393367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:33.4394117Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:33.5961848Z dist init r=0, world=2 2022-08-17T13:33:33.5965526Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:33.6107606Z dist init r=1, world=2 2022-08-17T13:33:33.6112305Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:33.6113403Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:33.6170696Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:33:34.9670224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:34.9670801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:35.4108109Z ok (3.412s) 2022-08-17T13:33:35.4134169Z test_params_count_and_value_rank0_only_False_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67938 2022-08-17T13:33:35.4139987Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67939 2022-08-17T13:33:36.7767010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:36.7767967Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:36.7769915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:36.7770819Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:36.8498747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:36.8499707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:36.8503116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:36.8504347Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:36.9446915Z dist init r=1, world=2 2022-08-17T13:33:36.9451493Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:37.0224025Z dist init r=0, world=2 2022-08-17T13:33:37.0229060Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:37.0229811Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:37.0266646Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:38.3946968Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:38.3947754Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:38.8227438Z ok (3.412s) 2022-08-17T13:33:38.8252970Z test_params_count_and_value_rank0_only_False_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68017 2022-08-17T13:33:38.8258763Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68018 2022-08-17T13:33:40.2328692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:40.2329707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:40.2331124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:40.2331645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:40.2646864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:40.2647799Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:40.2651713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:40.2652686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:40.4002906Z dist init r=1, world=2 2022-08-17T13:33:40.4007202Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:40.4383675Z dist init r=0, world=2 2022-08-17T13:33:40.4388211Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:40.4389227Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:40.4415744Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:41.8166157Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:41.8166686Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:42.2345627Z ok (3.412s) 2022-08-17T13:33:42.2372896Z test_params_count_and_value_rank0_only_False_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68096 2022-08-17T13:33:42.2378691Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68097 2022-08-17T13:33:43.6975734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:43.6976681Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:43.6979570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:43.6980535Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:43.7027764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:43.7028697Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:43.7033851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: 
UserWarning: loaded 238 disabled tests 2022-08-17T13:33:43.7034819Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:43.8661190Z dist init r=1, world=2 2022-08-17T13:33:43.8665102Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:43.8696864Z dist init r=0, world=2 2022-08-17T13:33:43.8702170Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:43.8703931Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:43.8769145Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:45.2255712Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:45.2256477Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:45.6464975Z ok (3.412s) 2022-08-17T13:33:45.6492039Z test_params_count_and_value_rank0_only_True_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68175 2022-08-17T13:33:45.6497869Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68176 2022-08-17T13:33:47.1294425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:47.1294938Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:47.1297410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:47.1297898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:47.1488142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:47.1488607Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:47.1492662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:47.1493162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:47.3034479Z dist init r=0, world=2 2022-08-17T13:33:47.3038510Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:47.3156552Z dist init r=1, world=2 2022-08-17T13:33:47.3161061Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:47.3161834Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:47.3243889Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:33:48.7105740Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:48.7106679Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:49.1586968Z ok (3.512s) 2022-08-17T13:33:49.1611982Z test_params_count_and_value_rank0_only_True_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68254 2022-08-17T13:33:49.1618206Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68255 2022-08-17T13:33:50.5915546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:50.5916069Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:50.5918311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:50.5918801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:50.6073867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:50.6074348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:50.6078751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:50.6079293Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:50.7587422Z dist init r=0, world=2 2022-08-17T13:33:50.7591450Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:50.7797863Z dist init r=1, world=2 2022-08-17T13:33:50.7802810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:50.7803548Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:50.7898507Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:52.1447263Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:52.1448190Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:52.5703645Z ok (3.412s) 2022-08-17T13:33:52.5730823Z test_params_count_and_value_rank0_only_True_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68333 2022-08-17T13:33:52.5736729Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68334 2022-08-17T13:33:53.9840988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:53.9841492Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:53.9844431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:53.9844938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:54.0150019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:54.0150483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:54.0154734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:54.0155223Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:54.1507513Z dist init r=0, world=2 2022-08-17T13:33:54.1511245Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:54.1877647Z dist init r=1, world=2 2022-08-17T13:33:54.1882540Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:54.1883383Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:54.1919369Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:55.5548170Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:55.5549072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:55.9823364Z ok (3.412s) 2022-08-17T13:33:55.9849298Z test_params_count_and_value_rank0_only_True_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68412 2022-08-17T13:33:55.9854808Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68413 2022-08-17T13:33:57.4697254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:57.4697827Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:57.4700834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:33:57.4701571Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:57.4718008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:33:57.4718449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:33:57.4723067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: 
UserWarning: loaded 238 disabled tests 2022-08-17T13:33:57.4723632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:33:57.6444961Z dist init r=0, world=2 2022-08-17T13:33:57.6449077Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:33:57.6481737Z dist init r=1, world=2 2022-08-17T13:33:57.6486432Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:33:57.6487196Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:57.6552617Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:33:59.0239943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:33:59.0240467Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:33:59.4943666Z ok (3.512s) 2022-08-17T13:33:59.4947971Z test_raises_rank0_with_writeback (__main__.TestSummonFullParams) 2022-08-17T13:33:59.4960804Z Tests that ``summon_full_params()`` with both ``rank0_only=True`` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68491 2022-08-17T13:33:59.4966618Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68492 2022-08-17T13:34:00.9755598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:00.9756078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:00.9759467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:00.9759941Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:01.0076911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:01.0077359Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:01.0080801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:01.0081292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:01.1447090Z dist init r=1, world=2 2022-08-17T13:34:01.1451209Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:01.1811222Z dist init r=0, world=2 2022-08-17T13:34:01.1816279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:01.1817017Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:01.1860277Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:34:02.5743695Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:02.5744826Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:03.0055027Z ok (3.511s) 2022-08-17T13:34:03.0086726Z test_reshard_outside_forward_backward_iteration_rank0_only_False_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68570 2022-08-17T13:34:03.0092238Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68571 2022-08-17T13:34:04.4459894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:04.4460401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:04.4462942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:04.4463432Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:04.4586132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:04.4586837Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:04.4591563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:04.4592075Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:04.6184378Z dist init r=0, world=2 2022-08-17T13:34:04.6188763Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:04.6243110Z dist init r=1, world=2 2022-08-17T13:34:04.6247970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:04.6248749Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:04.6292143Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:06.0005739Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:06.0006265Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:06.0205571Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:06.0206350Z warnings.warn( 2022-08-17T13:34:06.0207442Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:34:06.0208194Z warnings.warn( 2022-08-17T13:34:06.0234216Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:06.0234775Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:06.0235465Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:06.0235988Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:06.9189152Z ok (3.913s) 2022-08-17T13:34:06.9221103Z test_reshard_outside_forward_backward_iteration_rank0_only_False_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68653 2022-08-17T13:34:06.9227402Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68654 2022-08-17T13:34:08.3638306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:08.3638857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:08.3640829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:08.3641314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:08.3927504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:08.3928003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:08.3931080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:08.3931782Z warnings.warn(f"loaded {len(disabled_tests_dict)} 
disabled tests") 2022-08-17T13:34:08.5319070Z dist init r=0, world=2 2022-08-17T13:34:08.5322896Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:08.5673727Z dist init r=1, world=2 2022-08-17T13:34:08.5678597Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:08.5679319Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:08.5731207Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:09.9263519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:09.9264262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:09.9485356Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:09.9486144Z warnings.warn( 2022-08-17T13:34:09.9513065Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:09.9513612Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:09.9520745Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:09.9521502Z warnings.warn( 2022-08-17T13:34:09.9549945Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:09.9550503Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:10.8327276Z ok (3.914s) 2022-08-17T13:34:10.8359226Z test_reshard_outside_forward_backward_iteration_rank0_only_False_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68736 2022-08-17T13:34:10.8365021Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68737 2022-08-17T13:34:12.2653797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:12.2654296Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:12.2656990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:12.2657497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:12.3454751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:12.3455207Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:12.3459007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:12.3459487Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:12.4316973Z dist init r=1, world=2 
2022-08-17T13:34:12.4321474Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:12.5187738Z dist init r=0, world=2 2022-08-17T13:34:12.5192115Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:12.5193142Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:12.5237840Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:13.9057316Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:13.9057846Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:13.9285329Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:13.9286142Z warnings.warn( 2022-08-17T13:34:13.9314431Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:13.9314994Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:13.9320509Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:13.9321274Z warnings.warn( 2022-08-17T13:34:13.9348752Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:13.9349295Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:14.3512591Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:2505: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-08-17T13:34:14.3513308Z warnings.warn( 2022-08-17T13:34:14.3516686Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:2505: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-08-17T13:34:14.3517672Z warnings.warn( 2022-08-17T13:34:14.8464414Z ok (4.014s) 2022-08-17T13:34:14.8496775Z test_reshard_outside_forward_backward_iteration_rank0_only_False_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68819 2022-08-17T13:34:14.8503178Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68820 2022-08-17T13:34:16.3107104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:16.3107625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:16.3110245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:16.3110975Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:16.3112925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:16.3113386Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:16.3117661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:16.3118119Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:16.4849619Z dist init r=1, world=2 2022-08-17T13:34:16.4853464Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:16.4891760Z dist init r=0, world=2 2022-08-17T13:34:16.4896195Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:16.4897082Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:16.4956903Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:34:17.8728677Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:17.8729205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:17.8925026Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:17.8925827Z warnings.warn( 2022-08-17T13:34:17.8926924Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:34:17.8927672Z warnings.warn( 2022-08-17T13:34:17.8953260Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:17.8953823Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:17.8954517Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:17.8955045Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:18.3192828Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:2505: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-08-17T13:34:18.3193588Z warnings.warn( 2022-08-17T13:34:18.3196500Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:2505: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-08-17T13:34:18.3197169Z warnings.warn( 2022-08-17T13:34:18.7601423Z ok (3.914s) 2022-08-17T13:34:18.7634665Z test_reshard_outside_forward_backward_iteration_rank0_only_True_offload_to_cpu_False_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68902 2022-08-17T13:34:18.7640549Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68903 2022-08-17T13:34:20.2246810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:20.2247374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:20.2249696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:20.2250182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:20.2925214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:20.2925698Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:20.2929465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:20.2929942Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:20.3913494Z dist init r=1, world=2 2022-08-17T13:34:20.3917583Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:20.4640545Z dist init r=0, world=2 2022-08-17T13:34:20.4644941Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:20.4645945Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:20.4732290Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:34:21.8306771Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:21.8307310Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:21.8525425Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:21.8526225Z warnings.warn( 2022-08-17T13:34:21.8527333Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:34:21.8528094Z warnings.warn( 2022-08-17T13:34:21.8554061Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:21.8554647Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:21.8555330Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:21.8555874Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:22.7741878Z ok (4.014s) 2022-08-17T13:34:22.7775611Z test_reshard_outside_forward_backward_iteration_rank0_only_True_offload_to_cpu_False_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68985 2022-08-17T13:34:22.7781639Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68986 2022-08-17T13:34:24.1978055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:24.1978565Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:24.1980792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:24.1981281Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:24.2393536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:24.2394003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:24.2398053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:24.2398544Z warnings.warn(f"loaded {len(disabled_tests_dict)} 
disabled tests") 2022-08-17T13:34:24.3644971Z dist init r=0, world=2 2022-08-17T13:34:24.3648938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:24.4127691Z dist init r=1, world=2 2022-08-17T13:34:24.4132359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:24.4133128Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:24.4158845Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:25.8342723Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:25.8343246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:25.8564786Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:25.8565584Z warnings.warn( 2022-08-17T13:34:25.8566676Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:34:25.8567441Z warnings.warn( 2022-08-17T13:34:25.8592703Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:25.8593584Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:25.8594373Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:25.8594943Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:26.6875617Z ok (3.913s) 2022-08-17T13:34:26.6908682Z test_reshard_outside_forward_backward_iteration_rank0_only_True_offload_to_cpu_True_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69068 2022-08-17T13:34:26.6914511Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69069 2022-08-17T13:34:28.1946163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:28.1946708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:28.1949272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:28.1949754Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:28.2005940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:28.2006410Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:28.2010509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:28.2010983Z warnings.warn(f"loaded {len(disabled_tests_dict)} 
disabled tests") 2022-08-17T13:34:28.3694917Z dist init r=0, world=2 2022-08-17T13:34:28.3698030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:28.3707431Z dist init r=1, world=2 2022-08-17T13:34:28.3712004Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:28.3712955Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:28.3801605Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:29.7495435Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:29.7495950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:29.7724481Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:29.7725286Z warnings.warn( 2022-08-17T13:34:29.7726405Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:34:29.7727127Z warnings.warn( 2022-08-17T13:34:29.7752408Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:29.7752975Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:29.7753926Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:29.7754467Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:30.7013936Z ok (4.014s) 2022-08-17T13:34:30.7045903Z test_reshard_outside_forward_backward_iteration_rank0_only_True_offload_to_cpu_True_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69151 2022-08-17T13:34:30.7051807Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69152 2022-08-17T13:34:32.1825287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:32.1825787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:32.1828239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:32.1828751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:32.2407076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:32.2407542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:32.2411255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:32.2411736Z warnings.warn(f"loaded {len(disabled_tests_dict)} 
disabled tests") 2022-08-17T13:34:32.3491990Z dist init r=0, world=2 2022-08-17T13:34:32.3495510Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:32.4155377Z dist init r=1, world=2 2022-08-17T13:34:32.4159837Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:32.4160781Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:32.4209115Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:33.7780039Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:33.7780565Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:33.8005293Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:33.8006192Z warnings.warn( 2022-08-17T13:34:33.8007333Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:34:33.8008086Z warnings.warn( 2022-08-17T13:34:33.8032606Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:33.8033153Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:33.8035316Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:34:33.8035877Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:34:34.7150813Z ok (4.014s) 2022-08-17T13:34:34.7171824Z test_summon_from_non_fsdp (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69234 2022-08-17T13:34:34.7177868Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69235 2022-08-17T13:34:36.2069024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:36.2069540Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:36.2072147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:36.2072625Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:36.2238533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:36.2239021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:36.2243343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:36.2243806Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:36.3742435Z dist init r=0, world=2 
2022-08-17T13:34:36.3746766Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:36.3967965Z dist init r=1, world=2 2022-08-17T13:34:36.3972780Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:36.3973619Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:36.4053996Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:37.7733961Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:37.7734486Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:38.2266168Z ok (3.511s) 2022-08-17T13:34:38.2292334Z test_summon_full_param_recursive_recurse_False_summon_outer_False_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69313 2022-08-17T13:34:38.2298157Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69314 2022-08-17T13:34:39.6823920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:39.6824849Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:39.6827173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:39.6827680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:39.7038414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:39.7038873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:39.7042874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:39.7043340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:39.8486791Z dist init r=0, world=2 2022-08-17T13:34:39.8490407Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:39.8758189Z dist init r=1, world=2 2022-08-17T13:34:39.8763006Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:39.8763748Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:39.8797053Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:34:41.2391928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:41.2392468Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:41.2606310Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:41.2607373Z warnings.warn( 2022-08-17T13:34:41.2608571Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:41.2609372Z warnings.warn( 2022-08-17T13:34:41.6381421Z ok (3.411s) 2022-08-17T13:34:41.6409912Z test_summon_full_param_recursive_recurse_False_summon_outer_False_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69392 2022-08-17T13:34:41.6415711Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69393 2022-08-17T13:34:43.0887271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:43.0887771Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:43.0890429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:43.0890917Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:43.1021234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:43.1021691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:43.1025579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:43.1026067Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:43.2611275Z dist init r=0, world=2 2022-08-17T13:34:43.2615750Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:43.2670341Z dist init r=1, world=2 2022-08-17T13:34:43.2674579Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:43.2675747Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:43.2719340Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:34:44.6436544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:44.6437078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:44.6647130Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:44.6647961Z warnings.warn( 2022-08-17T13:34:44.6649085Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:44.6649833Z warnings.warn( 2022-08-17T13:34:45.0499323Z ok (3.412s) 2022-08-17T13:34:45.0526759Z test_summon_full_param_recursive_recurse_False_summon_outer_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69471 2022-08-17T13:34:45.0532955Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69472 2022-08-17T13:34:46.5012962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:46.5013463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:46.5015791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:46.5016279Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:46.5408104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:46.5408570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:46.5412673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:46.5413169Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:46.6679477Z dist init r=0, world=2 2022-08-17T13:34:46.6683250Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:46.7062189Z dist init r=1, world=2 2022-08-17T13:34:46.7065992Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:46.7067050Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:46.7091603Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:34:48.0573686Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:48.0574713Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:48.0596395Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:48.0598291Z warnings.warn( 2022-08-17T13:34:48.0765393Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:48.0767196Z warnings.warn( 2022-08-17T13:34:48.4616420Z ok (3.412s) 2022-08-17T13:34:48.4643167Z test_summon_full_param_recursive_recurse_False_summon_outer_True_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69550 2022-08-17T13:34:48.4648706Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69551 2022-08-17T13:34:49.8768351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:49.8768886Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:49.8771283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:49.8771776Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:49.9099272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:49.9100041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:49.9104199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:49.9104692Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:50.0437028Z dist init r=1, world=2 2022-08-17T13:34:50.0440761Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:50.0825781Z dist init r=0, world=2 2022-08-17T13:34:50.0830351Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:50.0831311Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:50.0848516Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:34:51.4855942Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:51.4856496Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:51.5084838Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:51.5085645Z warnings.warn( 2022-08-17T13:34:51.5086765Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:51.5087522Z warnings.warn( 2022-08-17T13:34:51.9735102Z ok (3.512s) 2022-08-17T13:34:51.9763253Z test_summon_full_param_recursive_recurse_True_summon_outer_False_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69629 2022-08-17T13:34:51.9769201Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69630 2022-08-17T13:34:53.3989314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:53.3989842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:53.3991943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:53.3992743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:53.3993325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:53.3994011Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:53.3998151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:53.3998775Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:53.5718732Z dist init r=0, world=2 2022-08-17T13:34:53.5722480Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:53.6188116Z dist init r=1, world=2 2022-08-17T13:34:53.6202090Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:53.6203308Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:53.6232896Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:34:55.0129142Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:55.0129717Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:55.0324645Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:55.0325453Z warnings.warn( 2022-08-17T13:34:55.0326570Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:55.0327318Z warnings.warn( 2022-08-17T13:34:55.4854627Z ok (3.512s) 2022-08-17T13:34:55.4882964Z test_summon_full_param_recursive_recurse_True_summon_outer_False_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69708 2022-08-17T13:34:55.4889098Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69709 2022-08-17T13:34:56.8987327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:56.8987867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:56.8990597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:56.8991103Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:56.9361754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:34:56.9362212Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:34:56.9366990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:34:56.9367473Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:34:57.0670788Z dist init r=1, world=2 2022-08-17T13:34:57.0674848Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:34:57.1126788Z dist init r=0, world=2 2022-08-17T13:34:57.1131559Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:34:57.1132935Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:34:57.1184703Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:34:58.4938806Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:34:58.4939342Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:34:58.5164819Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:58.5165991Z warnings.warn( 2022-08-17T13:34:58.5167179Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:34:58.5167994Z warnings.warn( 2022-08-17T13:34:58.9976056Z ok (3.512s) 2022-08-17T13:34:59.0003573Z test_summon_full_param_recursive_recurse_True_summon_outer_True_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69787 2022-08-17T13:34:59.0009682Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69788 2022-08-17T13:35:00.4333701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:00.4334278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:00.4336595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:00.4337080Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:00.4886306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:00.4886759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:00.4891135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:00.4891614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:00.6006997Z dist init r=1, world=2 2022-08-17T13:35:00.6010845Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:00.6641829Z dist init r=0, world=2 2022-08-17T13:35:00.6646683Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:00.6647909Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:00.6724227Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:35:02.0481114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:02.0481618Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:02.0684661Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:35:02.0685487Z warnings.warn( 2022-08-17T13:35:02.0686612Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:35:02.0687363Z warnings.warn( 2022-08-17T13:35:02.5096974Z ok (3.512s) 2022-08-17T13:35:02.5123871Z test_summon_full_param_recursive_recurse_True_summon_outer_True_mixed_precision_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69866 2022-08-17T13:35:02.5129844Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69867 2022-08-17T13:35:03.9258083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:03.9258561Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:03.9261175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:03.9261661Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:03.9702366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:03.9702814Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:03.9707812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:03.9708295Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:04.0921448Z dist init r=1, world=2 2022-08-17T13:35:04.0925515Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:04.1459901Z dist init r=0, world=2 2022-08-17T13:35:04.1464895Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:04.1466925Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:04.1537189Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:35:05.5308946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:05.5309487Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:05.5524502Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:35:05.5525276Z warnings.warn( 2022-08-17T13:35:05.5526370Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:35:05.5527119Z warnings.warn( 2022-08-17T13:35:06.0217538Z ok (3.512s) 2022-08-17T13:35:06.0242566Z test_summon_full_param_shard_value_mixed_precision_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69945 2022-08-17T13:35:06.0248591Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69946 2022-08-17T13:35:07.4518229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:07.4518739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:07.4520653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:07.4521145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:07.4694110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:07.4694822Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:07.4699178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:07.4699671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:07.6213022Z dist init r=1, world=2 2022-08-17T13:35:07.6216465Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:07.6433796Z dist init r=0, world=2 2022-08-17T13:35:07.6438635Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:07.6439443Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:07.6523353Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:35:09.0209875Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:09.0210406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:09.4334644Z ok (3.412s) 2022-08-17T13:35:09.4358929Z test_summon_full_param_shard_value_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70024 2022-08-17T13:35:09.4364553Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70025 2022-08-17T13:35:10.8566783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:10.8567287Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:10.8569560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:10.8570066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:10.8890695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:10.8891147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:10.8905606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:10.8906151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:11.0235762Z dist init r=1, world=2 2022-08-17T13:35:11.0239378Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:11.0641114Z dist init r=0, world=2 2022-08-17T13:35:11.0645914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:11.0646774Z INFO:torch.distributed.distributed_c10d:Rank 0: 
Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:11.0647499Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:12.4547203Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:12.4547766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:12.9456637Z ok (3.512s) 2022-08-17T13:35:12.9475828Z test_summon_full_param_writeback_writeback_False_cpu_offload_CPUOffload(offload_params=False)_mixed_precision_False_modify_outer_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70103 2022-08-17T13:35:12.9482075Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70104 2022-08-17T13:35:14.3729898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:14.3730685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:14.3732439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:14.3732930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:14.4009215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:14.4009692Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:14.4013863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:14.4014342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:14.5385626Z dist init r=1, world=2 2022-08-17T13:35:14.5389492Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:14.5744294Z dist init r=0, world=2 2022-08-17T13:35:14.5749356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:14.5750416Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:14.5797988Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:15.9586551Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:15.9587065Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:16.4570453Z ok (3.511s) 2022-08-17T13:35:16.4589699Z test_summon_full_param_writeback_writeback_False_cpu_offload_CPUOffload(offload_params=False)_mixed_precision_False_modify_outer_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70182 2022-08-17T13:35:16.4595511Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70183 2022-08-17T13:35:17.9274079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:17.9274585Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:17.9277186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:17.9277669Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:17.9613815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:17.9614278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:17.9618282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:17.9618769Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:18.0938880Z dist init r=1, world=2 2022-08-17T13:35:18.0942967Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:18.1329670Z dist init r=0, world=2 2022-08-17T13:35:18.1334277Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:18.1335277Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:18.1350470Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:35:19.5232272Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:19.9683475Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:19.9684204Z ok (3.511s) 2022-08-17T13:35:19.9702829Z test_summon_full_param_writeback_writeback_False_cpu_offload_CPUOffload(offload_params=False)_mixed_precision_True_modify_outer_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70261 2022-08-17T13:35:19.9708996Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70262 2022-08-17T13:35:21.3875712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:21.3876200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:21.3878269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:21.3878754Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:21.4069144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:21.4069614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:21.4073224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:21.4073703Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:21.5557924Z dist init r=1, world=2 2022-08-17T13:35:21.5561596Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:21.5809100Z dist init r=0, world=2 2022-08-17T13:35:21.5813688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-08-17T13:35:21.5814605Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:21.5867395Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:22.9630336Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:22.9630898Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:23.3794537Z ok (3.411s) 2022-08-17T13:35:23.3813607Z test_summon_full_param_writeback_writeback_False_cpu_offload_CPUOffload(offload_params=False)_mixed_precision_True_modify_outer_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70340 2022-08-17T13:35:23.3819383Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70341 2022-08-17T13:35:24.8395781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:24.8396290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:24.8396877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:24.8397377Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:24.8732478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:24.8732965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:24.8736217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:24.8736701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:25.0059018Z dist init r=1, world=2 
2022-08-17T13:35:25.0062959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:25.0445737Z dist init r=0, world=2 2022-08-17T13:35:25.0450197Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:25.0451137Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:25.0470268Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:26.4232681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:26.4233200Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:26.8905938Z ok (3.511s) 2022-08-17T13:35:26.8924934Z test_summon_full_param_writeback_writeback_False_cpu_offload_CPUOffload(offload_params=True)_mixed_precision_False_modify_outer_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70419 2022-08-17T13:35:26.8930623Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70420 2022-08-17T13:35:28.3371885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:28.3372402Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:28.3374549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:28.3375009Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:28.3675838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:28.3676290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:28.3680366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:28.3680815Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:28.5051148Z dist init r=0, world=2 2022-08-17T13:35:28.5056063Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:28.5395176Z dist init r=1, world=2 2022-08-17T13:35:28.5399055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:28.5399835Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:28.5463711Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:35:29.9021500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:29.9022035Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:30.4017877Z ok (3.511s) 2022-08-17T13:35:30.4037766Z test_summon_full_param_writeback_writeback_False_cpu_offload_CPUOffload(offload_params=True)_mixed_precision_False_modify_outer_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70498 2022-08-17T13:35:30.4043620Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70499 2022-08-17T13:35:31.8480342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:31.8480865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:31.8483511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:31.8483985Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:31.8582376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:31.8582834Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:31.8587497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:31.8588136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:32.0207647Z dist init r=0, world=2 2022-08-17T13:35:32.0212002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:32.0237882Z dist init r=1, world=2 2022-08-17T13:35:32.0242392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-08-17T13:35:32.0243570Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:32.0315622Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:33.4027529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:33.4028070Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:33.8127654Z ok (3.411s) 2022-08-17T13:35:33.8146536Z test_summon_full_param_writeback_writeback_False_cpu_offload_CPUOffload(offload_params=True)_mixed_precision_True_modify_outer_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70577 2022-08-17T13:35:33.8152269Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70578 2022-08-17T13:35:35.2383599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:35.2384388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:35.2385911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:35.2386622Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:35.2718037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:35.2718596Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:35.2721997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:35.2722727Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:35.4064134Z dist init r=0, world=2 
2022-08-17T13:35:35.4067383Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:35.4472102Z dist init r=1, world=2 2022-08-17T13:35:35.4476666Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:35.4477844Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:35.4577938Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:36.8347450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:36.8348291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:37.3242458Z ok (3.511s) 2022-08-17T13:35:37.3261846Z test_summon_full_param_writeback_writeback_False_cpu_offload_CPUOffload(offload_params=True)_mixed_precision_True_modify_outer_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70656 2022-08-17T13:35:37.3267858Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70657 2022-08-17T13:35:38.7966021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:38.7966531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:38.7969020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:38.7969778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:38.8401908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:38.8402371Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:38.8406391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:38.8406862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:38.9690927Z dist init r=0, world=2 2022-08-17T13:35:38.9695208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:39.0106781Z dist init r=1, world=2 2022-08-17T13:35:39.0111051Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:39.0112049Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:39.0205485Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:35:40.3956195Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:40.3956716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:40.8354617Z ok (3.511s) 2022-08-17T13:35:40.8373461Z test_summon_full_param_writeback_writeback_True_cpu_offload_CPUOffload(offload_params=False)_mixed_precision_False_modify_outer_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70735 2022-08-17T13:35:40.8379467Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70736 2022-08-17T13:35:42.2839734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:42.2840224Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:42.2842778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:42.2843495Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:42.3296917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:42.3297570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:42.3301383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:42.3302097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:42.4502312Z dist init r=0, world=2 2022-08-17T13:35:42.4506688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:42.5016774Z dist init r=1, world=2 2022-08-17T13:35:42.5021489Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-08-17T13:35:42.5022758Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:42.5118452Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:43.9056460Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:43.9057024Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:44.3466844Z ok (3.511s) 2022-08-17T13:35:44.3485772Z test_summon_full_param_writeback_writeback_True_cpu_offload_CPUOffload(offload_params=False)_mixed_precision_False_modify_outer_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70814 2022-08-17T13:35:44.3491537Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70815 2022-08-17T13:35:45.8289600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:45.8290100Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:45.8293006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:45.8293493Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:45.8294078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:45.8294512Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:45.8298517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:45.8299287Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:46.0093619Z dist init r=0, world=2 
2022-08-17T13:35:46.0097065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:46.0109543Z dist init r=1, world=2 2022-08-17T13:35:46.0114095Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:46.0115393Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:46.0200624Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:47.3989091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:47.3989636Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:47.8578836Z ok (3.511s) 2022-08-17T13:35:47.8598414Z test_summon_full_param_writeback_writeback_True_cpu_offload_CPUOffload(offload_params=False)_mixed_precision_True_modify_outer_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70893 2022-08-17T13:35:47.8604237Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70894 2022-08-17T13:35:49.2711854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:49.2712349Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:49.2715113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:49.2715596Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:49.3002551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:49.3003016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:49.3007181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:49.3007687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:49.4375934Z dist init r=0, world=2 2022-08-17T13:35:49.4380096Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:49.4734536Z dist init r=1, world=2 2022-08-17T13:35:49.4738948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:49.4739845Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:49.4788799Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:35:50.8461529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:50.8462426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:51.2687683Z ok (3.411s) 2022-08-17T13:35:51.2707607Z test_summon_full_param_writeback_writeback_True_cpu_offload_CPUOffload(offload_params=False)_mixed_precision_True_modify_outer_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70972 2022-08-17T13:35:51.2713528Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70973 2022-08-17T13:35:52.7509057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:52.7509564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:52.7512049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:52.7512553Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:52.7638442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:52.7638901Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:52.7642953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:52.7643438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:52.9173856Z dist init r=0, world=2 2022-08-17T13:35:52.9177523Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:52.9374010Z dist init r=1, world=2 2022-08-17T13:35:52.9378481Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-08-17T13:35:52.9379248Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:52.9382255Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:54.3058615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:54.3059141Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:54.7800897Z ok (3.511s) 2022-08-17T13:35:54.7820212Z test_summon_full_param_writeback_writeback_True_cpu_offload_CPUOffload(offload_params=True)_mixed_precision_False_modify_outer_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71051 2022-08-17T13:35:54.7826430Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71052 2022-08-17T13:35:56.2323392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:56.2324138Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:56.2326306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:56.2326802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:56.2672597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:56.2673076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:56.2677191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:56.2677686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:56.3983462Z dist init r=0, world=2 
2022-08-17T13:35:56.3988126Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:56.4395667Z dist init r=1, world=2 2022-08-17T13:35:56.4400162Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:56.4401429Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:56.4498103Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:57.8226816Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:35:57.8227357Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:35:58.2913708Z ok (3.511s) 2022-08-17T13:35:58.2932699Z test_summon_full_param_writeback_writeback_True_cpu_offload_CPUOffload(offload_params=True)_mixed_precision_False_modify_outer_True (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71130 2022-08-17T13:35:58.2938520Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71131 2022-08-17T13:35:59.7202153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:59.7202689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:59.7205704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:59.7206204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:59.7299881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:35:59.7300350Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:35:59.7304529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:35:59.7305020Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:35:59.9017061Z dist init r=0, world=2 2022-08-17T13:35:59.9020748Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:35:59.9301773Z dist init r=1, world=2 2022-08-17T13:35:59.9307302Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:35:59.9308540Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:35:59.9327392Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:36:01.3255082Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:36:01.3255629Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:01.8025874Z ok (3.511s) 2022-08-17T13:36:01.8045702Z test_summon_full_param_writeback_writeback_True_cpu_offload_CPUOffload(offload_params=True)_mixed_precision_True_modify_outer_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71209 2022-08-17T13:36:01.8051302Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71210 2022-08-17T13:36:03.2556520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:03.2557027Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:03.2559678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:03.2560155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:03.3455030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:03.3455523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:03.3459507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:03.3459971Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:03.4218654Z dist init r=0, world=2 2022-08-17T13:36:03.4222773Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:03.5200024Z dist init r=1, world=2 2022-08-17T13:36:03.5204839Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-08-17T13:36:03.5205935Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:03.5240311Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:04.8756648Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:04.8757216Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:36:05.3136449Z ok (3.511s) 2022-08-17T13:36:05.3157967Z test_summon_full_param_writeback_writeback_True_cpu_offload_CPUOffload(offload_params=True)_mixed_precision_True_modify_outer_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71288 2022-08-17T13:36:05.3163912Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71289 2022-08-17T13:36:06.7608797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:06.7609298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:06.7612175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:06.7612651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:06.8064997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:06.8065469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:06.8069614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:06.8070078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:06.9290256Z dist init r=1, world=2 
2022-08-17T13:36:06.9294289Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:36:06.9804365Z dist init r=0, world=2 2022-08-17T13:36:06.9808783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:06.9809745Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:06.9906471Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:08.3549874Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:36:08.3550402Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:08.8248659Z ok (3.511s) 2022-08-17T13:36:08.8274703Z test_summon_full_params_equivalence_rank0_only_False_offload_to_cpu_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71367 2022-08-17T13:36:08.8280572Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71368 2022-08-17T13:36:10.2774877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:10.2775400Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:10.2778207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:10.2778671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:10.3165356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:10.3165822Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:10.3169878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:10.3170340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:10.4433816Z dist init r=0, world=2 2022-08-17T13:36:10.4437869Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:10.4889488Z dist init r=1, world=2 2022-08-17T13:36:10.4893807Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:36:10.4894961Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:10.4947308Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:36:11.8514225Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:36:11.8514758Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:12.3368218Z ok (3.512s) 2022-08-17T13:36:12.3394082Z test_summon_full_params_equivalence_rank0_only_False_offload_to_cpu_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71446 2022-08-17T13:36:12.3400209Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71447 2022-08-17T13:36:13.7551046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:13.7551544Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:13.7554194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:13.7554680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:13.8236045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:13.8236503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:13.8240330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:13.8240829Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:13.9262669Z dist init r=1, world=2 2022-08-17T13:36:13.9266746Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:36:13.9981887Z dist init r=0, world=2 2022-08-17T13:36:13.9986573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:13.9987578Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:14.0081666Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:15.3708909Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:36:15.3709442Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:15.3962649Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:2505: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-08-17T13:36:15.3963853Z warnings.warn( 2022-08-17T13:36:15.3965604Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:2505: UserWarning: offload_to_cpu and rank0_only=False will result in full parameters being redundantly copied to CPU memory for GPUs that reside on the same machine, which may incur the risk of CPU OOM. It is recommended to use ``offload_to_cpu`` with rank0_only=True. 2022-08-17T13:36:15.3966902Z warnings.warn( 2022-08-17T13:36:15.8487024Z ok (3.512s) 2022-08-17T13:36:15.8514150Z test_summon_full_params_equivalence_rank0_only_True_offload_to_cpu_False (__main__.TestSummonFullParams) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71525 2022-08-17T13:36:15.8520152Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71526 2022-08-17T13:36:17.2666640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:17.2667127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:17.2669853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:17.2670338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:17.2892742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:17.2893196Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:17.2897227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:17.2897712Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:17.4335803Z dist init r=0, world=2 2022-08-17T13:36:17.4339720Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:17.4611726Z dist init r=1, world=2 2022-08-17T13:36:17.4616212Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:36:17.4617307Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:17.4646268Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:36:18.8245908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:36:18.8246913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:19.2604307Z ok (3.412s) 2022-08-17T13:36:19.2630032Z test_summon_full_params_equivalence_rank0_only_True_offload_to_cpu_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71604 2022-08-17T13:36:19.2635870Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71605 2022-08-17T13:36:20.6793612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:20.6794122Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:20.6796444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:20.6796932Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:20.7143812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:20.7144276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:20.7148362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:20.7148851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:20.8470231Z dist init r=1, world=2 2022-08-17T13:36:20.8474004Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:36:20.8891746Z dist init r=0, world=2 2022-08-17T13:36:20.8896350Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:20.8897275Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:20.8984023Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:22.2638385Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:36:22.2638934Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:22.7723767Z ok (3.512s) 2022-08-17T13:36:22.7748456Z test_summon_full_params_respects_reshard_after_forward_mixed_precision_False (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71683 2022-08-17T13:36:22.7754752Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71684 2022-08-17T13:36:24.1995096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:24.1995619Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:24.1997809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:24.1998316Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:24.2245462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:24.2245932Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:24.2249929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:24.2250407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:24.3679834Z dist init r=1, world=2 2022-08-17T13:36:24.3683555Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:36:24.3972080Z dist init r=0, world=2 2022-08-17T13:36:24.3976782Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:24.3977576Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:24.3991192Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:25.7868773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:36:25.7869306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:25.8085523Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:36:25.8086638Z warnings.warn( 2022-08-17T13:36:25.8116018Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:36:25.8116601Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:36:25.8121006Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:36:25.8121777Z warnings.warn( 2022-08-17T13:36:25.8149827Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:36:25.8150385Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:36:26.6853817Z ok (3.913s) 2022-08-17T13:36:26.6878424Z test_summon_full_params_respects_reshard_after_forward_mixed_precision_True (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71762 2022-08-17T13:36:26.6884285Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71763 2022-08-17T13:36:28.1441140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:28.1442121Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:28.1443786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:28.1444706Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:28.1608596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:28.1609540Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:28.1613202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:28.1614198Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:28.3124565Z dist init r=0, world=2 2022-08-17T13:36:28.3128842Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:28.3334189Z dist init r=1, world=2 2022-08-17T13:36:28.3338485Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:36:28.3339563Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:28.3435898Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:29.7015644Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:36:29.7017021Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:29.7206648Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:36:29.7207460Z warnings.warn( 2022-08-17T13:36:29.7239595Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:36:29.7240393Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:36:29.7248105Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:36:29.7248869Z warnings.warn( 2022-08-17T13:36:29.7295264Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:36:29.7295826Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:36:30.5982052Z ok (3.913s) 2022-08-17T13:36:30.6003173Z test_summon_single_param (__main__.TestSummonFullParams) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71841 2022-08-17T13:36:30.6009270Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71842 2022-08-17T13:36:32.0341563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:32.0342083Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:32.0344698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:32.0345191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:32.1151792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:32.1152284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:32.1155363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:32.1155866Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:32.2053337Z dist init r=0, world=2 2022-08-17T13:36:32.2057527Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:32.2882279Z dist init r=1, world=2 2022-08-17T13:36:32.2886824Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:36:32.2887557Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:32.2974587Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:36:33.6524679Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:33.6525213Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:36:33.6727952Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:36:33.6728765Z warnings.warn( 2022-08-17T13:36:33.6729878Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:36:33.6730743Z warnings.warn( 2022-08-17T13:36:34.1097434Z ok (3.511s) 2022-08-17T13:36:34.1117676Z test_summon_full_param_writeback_writeback_False_modify_outer_False_mixed_precision_False (__main__.TestSummonFullParamsNoShard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71920 2022-08-17T13:36:35.5201049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:35.5201701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:35.5203897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:35.5204380Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:35.6868815Z dist init r=0, world=1 2022-08-17T13:36:35.6873324Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:35.6874216Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:36:36.9241382Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:37.3192654Z ok (3.209s) 2022-08-17T13:36:37.3211777Z test_summon_full_param_writeback_writeback_False_modify_outer_False_mixed_precision_True (__main__.TestSummonFullParamsNoShard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71960 2022-08-17T13:36:38.7525824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:38.7526312Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:38.7528803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:38.7529293Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:38.9247699Z dist init r=0, world=1 2022-08-17T13:36:38.9251827Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:38.9252721Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:36:40.2033594Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:40.6289158Z ok (3.309s) 2022-08-17T13:36:40.6307448Z test_summon_full_param_writeback_writeback_False_modify_outer_True_mixed_precision_False (__main__.TestSummonFullParamsNoShard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72000 2022-08-17T13:36:42.0346520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:42.0347036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:42.0349460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:42.0349947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:42.2003850Z dist init r=0, world=1 2022-08-17T13:36:42.2007482Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:42.2008319Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:36:43.4529272Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:43.8384455Z ok (3.209s) 2022-08-17T13:36:43.8403135Z test_summon_full_param_writeback_writeback_False_modify_outer_True_mixed_precision_True (__main__.TestSummonFullParamsNoShard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72040 2022-08-17T13:36:45.2729133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:45.2729894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:45.2732204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:45.2732691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:45.4450283Z dist init r=0, world=1 2022-08-17T13:36:45.4454636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:45.4455378Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:36:46.7172455Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:47.1480155Z ok (3.309s) 2022-08-17T13:36:47.1499609Z test_summon_full_param_writeback_writeback_True_modify_outer_False_mixed_precision_False (__main__.TestSummonFullParamsNoShard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72080 2022-08-17T13:36:48.5867162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:48.5867669Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:48.5870922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:48.5871387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:48.7597772Z dist init r=0, world=1 2022-08-17T13:36:48.7601582Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:48.7602836Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:36:50.0310578Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:50.4576683Z ok (3.310s) 2022-08-17T13:36:50.4595713Z test_summon_full_param_writeback_writeback_True_modify_outer_False_mixed_precision_True (__main__.TestSummonFullParamsNoShard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72120 2022-08-17T13:36:51.8898242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:51.8901330Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:51.8901941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:51.8902421Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:52.0609042Z dist init r=0, world=1 2022-08-17T13:36:52.0613266Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:52.0614334Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:36:53.3276959Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:53.6672634Z ok (3.209s) 2022-08-17T13:36:53.6690968Z test_summon_full_param_writeback_writeback_True_modify_outer_True_mixed_precision_False (__main__.TestSummonFullParamsNoShard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72160 2022-08-17T13:36:55.0719370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:55.0720660Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:55.0722333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:55.0723278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:55.2462659Z dist init r=0, world=1 2022-08-17T13:36:55.2467808Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:55.2469140Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:36:56.5300778Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:36:56.8769478Z ok (3.210s) 2022-08-17T13:36:56.8789023Z test_summon_full_param_writeback_writeback_True_modify_outer_True_mixed_precision_True (__main__.TestSummonFullParamsNoShard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72200 2022-08-17T13:36:58.2809985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:36:58.2810501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:36:58.2813091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:36:58.2813795Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:36:58.4468519Z dist init r=0, world=1 2022-08-17T13:36:58.4472781Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:36:58.4473529Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:36:59.6999982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:00.0863845Z ok (3.209s) 2022-08-17T13:37:00.0864068Z 2022-08-17T13:37:00.0864457Z ---------------------------------------------------------------------- 2022-08-17T13:37:00.0864802Z Ran 73 tests in 258.575s 2022-08-17T13:37:00.0864974Z 2022-08-17T13:37:00.0865051Z OK 2022-08-17T13:37:00.0865189Z 2022-08-17T13:37:00.0867258Z Generating XML reports... 2022-08-17T13:37:00.0973079Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params/TEST-TestSummonFullParams-20220817133241.xml 2022-08-17T13:37:00.0983040Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params/TEST-TestSummonFullParamsNoShard-20220817133241.xml 2022-08-17T13:37:00.4501597Z Running distributed/fsdp/test_fsdp_state_dict ... 
[2022-08-17 13:37:00.449705] 2022-08-17T13:37:00.4502348Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_state_dict.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:37:00.449776] 2022-08-17T13:37:02.0766728Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_state_dict 2022-08-17T13:37:02.0789892Z 2022-08-17T13:37:02.0790043Z Running tests... 2022-08-17T13:37:02.0790743Z ---------------------------------------------------------------------- 2022-08-17T13:37:02.0811269Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:37:03.6045105Z Tests that we can save a state_dict and load it into a blank model ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:37:03.6227805Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72275 2022-08-17T13:37:03.6234135Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72276 2022-08-17T13:37:05.0648318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:05.0648812Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:05.0651255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:05.0651742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:05.1053314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:05.1053775Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:05.1056520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T13:37:05.1056993Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:05.2366978Z dist init r=1, world=2 2022-08-17T13:37:05.2370711Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:05.2769468Z dist init r=0, world=2 2022-08-17T13:37:05.2774074Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:05.2774797Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:05.2778796Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:06.6404237Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:06.6404770Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:07.1325578Z ok (5.053s) 2022-08-17T13:37:07.1344458Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:37:07.1358943Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72354 2022-08-17T13:37:07.1364810Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72355 2022-08-17T13:37:08.5563334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:08.5563852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:08.5566256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:08.5566739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:08.5795463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:08.5795900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:08.5799822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:08.5800303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:08.7236047Z dist init r=0, world=2 2022-08-17T13:37:08.7240005Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:08.7520244Z dist init r=1, world=2 2022-08-17T13:37:08.7524915Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:08.7526226Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:08.7546572Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:37:10.1172395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:10.1172913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:10.5450332Z ok (3.412s) 2022-08-17T13:37:10.5468591Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:37:10.5482756Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72433 2022-08-17T13:37:10.5489129Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72434 2022-08-17T13:37:11.9983434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:11.9984233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:11.9986735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:11.9987225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:12.0109897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:12.0110362Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:12.0114649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:12.0115127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:12.1653852Z dist init r=1, world=2 2022-08-17T13:37:12.1657862Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:12.1830083Z dist init r=0, world=2 
2022-08-17T13:37:12.1834579Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:12.1835612Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:12.1862729Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:13.5668660Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:13.5669213Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:14.0578148Z ok (3.513s) 2022-08-17T13:37:14.0596476Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:37:14.0609688Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72512 2022-08-17T13:37:14.0616614Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72513 2022-08-17T13:37:15.4918760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:15.4919262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:15.4921370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:15.4921878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:15.5657166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:15.5658127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:15.5660553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:15.5661506Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:15.6622433Z dist init r=1, world=2 2022-08-17T13:37:15.6625986Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:15.7421333Z dist init r=0, world=2 2022-08-17T13:37:15.7425871Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:15.7427080Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:15.7440598Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:37:17.1209098Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:17.1209672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:17.5704437Z ok (3.512s) 2022-08-17T13:37:17.5722416Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:37:17.5736138Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72591 2022-08-17T13:37:17.5742300Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72592 2022-08-17T13:37:19.0272038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:19.0272723Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:19.0274988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:19.0275453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:19.0439476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:19.0439932Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:19.0444089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:19.0444546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:19.1943830Z dist init r=1, world=2 2022-08-17T13:37:19.1948068Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:19.2162033Z dist init r=0, world=2 
2022-08-17T13:37:19.2166788Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:19.2167550Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:19.2255038Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:20.5956163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:20.5956719Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:21.0830774Z ok (3.513s) 2022-08-17T13:37:21.0849069Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:37:21.0863023Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72670 2022-08-17T13:37:21.0869646Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72671 2022-08-17T13:37:22.5189682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:22.5190648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:22.5192892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:22.5390013Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:22.5391233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:22.5392493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:22.5394739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:22.5395707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:22.6856410Z dist init r=1, world=2 2022-08-17T13:37:22.6860197Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:22.7139946Z dist init r=0, world=2 2022-08-17T13:37:22.7144841Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:22.7145944Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:22.7166794Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:37:24.1057497Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:24.1058366Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:24.4954908Z ok (3.412s) 2022-08-17T13:37:24.4972711Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:37:24.4986666Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72749 2022-08-17T13:37:24.4992889Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72750 2022-08-17T13:37:25.9923669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:25.9924179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:25.9927315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:25.9927808Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:26.0073080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:26.0073543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:26.0077517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:26.0077982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:26.1590094Z dist init r=1, world=2 2022-08-17T13:37:26.1594521Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:26.1800607Z dist init r=0, world=2 
2022-08-17T13:37:26.1805153Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:26.1806018Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:26.1901922Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:27.5568482Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:27.5569005Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:28.0079206Z ok (3.512s) 2022-08-17T13:37:28.0096998Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:37:28.0110663Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72828 2022-08-17T13:37:28.0117100Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72829 2022-08-17T13:37:29.3824072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:29.3824577Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:29.3830564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:29.3831373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:29.4035265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:29.4035738Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:29.4039470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:29.4039965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:29.5497140Z dist init r=0, world=2 2022-08-17T13:37:29.5500436Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:29.5761316Z dist init r=1, world=2 2022-08-17T13:37:29.5765861Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:29.5767053Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:29.5807528Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:37:30.9346170Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:30.9346677Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:31.3199274Z ok (3.312s) 2022-08-17T13:37:31.3217459Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:37:31.3231183Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72907 2022-08-17T13:37:31.3237592Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72908 2022-08-17T13:37:32.7972535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:32.7973036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:32.7975321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:32.7975800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:32.8359855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:32.8360345Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:32.8364445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:32.8364953Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:32.9637014Z dist init r=0, world=2 2022-08-17T13:37:32.9640932Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:33.0065612Z dist init r=1, world=2 
2022-08-17T13:37:33.0070199Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:33.0071040Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:33.0150586Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:34.3639150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:34.3639662Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:34.9326200Z ok (3.613s) 2022-08-17T13:37:34.9345426Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:37:34.9360224Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72986 2022-08-17T13:37:34.9366241Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72987 2022-08-17T13:37:36.3776442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:36.3777013Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:36.3779633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:36.3780120Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:36.4155833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:36.4156313Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:36.4160228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:36.4160712Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:36.5497085Z dist init r=1, world=2 2022-08-17T13:37:36.5501463Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:36.5836418Z dist init r=0, world=2 2022-08-17T13:37:36.5840797Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:36.5841522Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:36.5910212Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:37:37.9616563Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:37.9617131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:38.4462220Z ok (3.514s) 2022-08-17T13:37:38.4480159Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:37:38.4493256Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73065 2022-08-17T13:37:38.4499332Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73066 2022-08-17T13:37:39.8849674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:39.8850359Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:39.8852521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:39.8853017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:39.9119284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:39.9119746Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:39.9123981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:39.9124478Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:40.0545865Z dist init r=0, world=2 2022-08-17T13:37:40.0550745Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:40.0848357Z dist init r=1, world=2 
2022-08-17T13:37:40.0852919Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:40.0853926Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:40.0857193Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:41.4721486Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:41.4722037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:42.0588392Z ok (3.612s) 2022-08-17T13:37:42.0606677Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:37:42.0620135Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73144 2022-08-17T13:37:42.0626632Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73145 2022-08-17T13:37:43.4918870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:43.4919861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:43.4921653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:43.4922602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:43.5877405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:43.5878348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:43.5881462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:43.5882432Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:43.6579645Z dist init r=0, world=2 2022-08-17T13:37:43.6584336Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:43.7591681Z dist init r=1, world=2 2022-08-17T13:37:43.7596109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:43.7597089Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:43.7602129Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:37:45.1212730Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:45.1213261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:45.5714504Z ok (3.512s) 2022-08-17T13:37:45.5732350Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:37:45.5745916Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73223 2022-08-17T13:37:45.5752857Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73224 2022-08-17T13:37:47.0720639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:47.0721148Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:47.0723390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:47.0723880Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:47.1199725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:47.1200174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:47.1204276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:47.1204764Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:47.2378284Z dist init r=0, world=2 2022-08-17T13:37:47.2382128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:47.2946736Z dist init r=1, world=2 
2022-08-17T13:37:47.2951548Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:47.2952618Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:47.2994206Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:48.6545370Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:48.6545903Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:49.1843229Z ok (3.613s) 2022-08-17T13:37:49.1861374Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:37:49.1875499Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73302 2022-08-17T13:37:49.1881644Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73303 2022-08-17T13:37:50.6194766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:50.6195309Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:50.6197442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:50.6197930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:50.6455054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:50.6455539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:50.6460182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:50.6460695Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:50.7885196Z dist init r=1, world=2 2022-08-17T13:37:50.7888900Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:50.8177138Z dist init r=0, world=2 2022-08-17T13:37:50.8181624Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:50.8182356Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:50.8195442Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:37:52.1999171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:52.1999709Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:52.5965352Z ok (3.412s) 2022-08-17T13:37:52.5982345Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:37:52.5995962Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73381 2022-08-17T13:37:52.6002028Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73382 2022-08-17T13:37:54.0727658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:54.0728176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:54.0730162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:54.0730656Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:54.1195490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:54.1195980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:54.1199558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:54.1200043Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:54.2472480Z dist init r=0, world=2 2022-08-17T13:37:54.2476521Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:54.2893378Z dist init r=1, world=2 
2022-08-17T13:37:54.2897763Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:54.2898475Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:54.2987008Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:55.6840020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:55.6840534Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:56.2091265Z ok (3.612s) 2022-08-17T13:37:56.2108620Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:37:56.2121904Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73460 2022-08-17T13:37:56.2127959Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73461 2022-08-17T13:37:57.6348268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:57.6348987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:57.6353172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:57.6353919Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:57.6642107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:37:57.6642561Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:37:57.6647042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:37:57.6647783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:37:57.8007352Z dist init r=1, world=2 2022-08-17T13:37:57.8011410Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:37:57.8383387Z dist init r=0, world=2 2022-08-17T13:37:57.8388455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:37:57.8389248Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:37:57.8419716Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:37:59.2180144Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:37:59.2180659Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:37:59.6215305Z ok (3.412s) 2022-08-17T13:37:59.6233306Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:37:59.6246851Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73539 2022-08-17T13:37:59.6253011Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73540 2022-08-17T13:38:01.0640191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:01.0640687Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:01.0643076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:01.0643560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:01.0896970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:01.0897443Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:01.0901625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:01.0902103Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:01.2298651Z dist init r=0, world=2 2022-08-17T13:38:01.2302446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:01.2609661Z dist init r=1, world=2 
2022-08-17T13:38:01.2614445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:01.2615336Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:01.2710798Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:02.6333032Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:02.6333611Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:03.1341263Z ok (3.513s) 2022-08-17T13:38:03.1358944Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:38:03.1371541Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73618 2022-08-17T13:38:03.1377852Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73619 2022-08-17T13:38:04.5700767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:04.5701558Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:04.5703828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:04.5704314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:04.5795063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:04.5795526Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:04.5799497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:04.5799973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:04.7380279Z dist init r=1, world=2 2022-08-17T13:38:04.7383699Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:04.7523460Z dist init r=0, world=2 2022-08-17T13:38:04.7528158Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:04.7529326Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:04.7589080Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:38:06.1131328Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:06.1131858Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:06.6465725Z ok (3.512s) 2022-08-17T13:38:06.6483718Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:38:06.6497059Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73697 2022-08-17T13:38:06.6503380Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73698 2022-08-17T13:38:08.0591596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:08.0592073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:08.0594447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:08.0594938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:08.0936083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:08.0936536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:08.0940579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:08.0941267Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:08.2249278Z dist init r=1, world=2 2022-08-17T13:38:08.2253333Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:08.2653269Z dist init r=0, world=2 
2022-08-17T13:38:08.2657619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:08.2658812Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:08.2661399Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:09.6412834Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:09.6413690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:10.1591131Z ok (3.512s) 2022-08-17T13:38:10.1608634Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:38:10.1621903Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73776 2022-08-17T13:38:10.1628285Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73777 2022-08-17T13:38:11.5725511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:11.5726022Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:11.5728302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:11.5728930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:11.6013409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:11.6014174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:11.6018230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:11.6018771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:11.7397144Z dist init r=0, world=2 2022-08-17T13:38:11.7401070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:11.7745195Z dist init r=1, world=2 2022-08-17T13:38:11.7750179Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:11.7751029Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:11.7809717Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:38:13.1378588Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:13.1379123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:13.2273520Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:38:13.2274821Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:38:13.6718502Z ok (3.513s) 2022-08-17T13:38:13.6736448Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:38:13.6749502Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73855 2022-08-17T13:38:13.6756144Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73856 2022-08-17T13:38:15.1325026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:15.1325626Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:15.1327478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:15.1327965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:15.1728735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:15.1729184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:15.1733537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:15.1734017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:15.2998362Z dist init r=1, world=2 2022-08-17T13:38:15.3002118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:15.3445930Z dist init r=0, world=2 2022-08-17T13:38:15.3450885Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:15.3451642Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:15.3512191Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:38:16.7316373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:16.7316906Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:17.2845307Z ok (3.613s) 2022-08-17T13:38:17.2863227Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:38:17.2876565Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73934 2022-08-17T13:38:17.2882978Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73935 2022-08-17T13:38:18.7822656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:18.7823175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:18.7825806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:18.7826291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:18.8031909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:18.8032382Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:18.8036287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:18.8036762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:18.9526327Z dist init r=0, world=2 2022-08-17T13:38:18.9530234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:18.9771505Z dist init r=1, world=2 
2022-08-17T13:38:18.9775957Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:18.9777289Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:18.9836872Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:20.3632581Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:20.3633110Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:20.8973985Z ok (3.613s) 2022-08-17T13:38:20.8991327Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:38:20.9004528Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74013 2022-08-17T13:38:20.9010853Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74014 2022-08-17T13:38:22.3404920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:22.3405470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:22.3407003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:22.3407483Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:22.3568697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:22.3569185Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:22.3573360Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:22.3573839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:22.5063720Z dist init r=1, world=2 2022-08-17T13:38:22.5067640Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:22.5291998Z dist init r=0, world=2 2022-08-17T13:38:22.5296486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:22.5297340Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:22.5374462Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:38:23.9049126Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:23.9049659Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:24.4103616Z ok (3.513s) 2022-08-17T13:38:24.4121591Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:38:24.4134971Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74092 2022-08-17T13:38:24.4140997Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74093 2022-08-17T13:38:25.8560048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:25.8560558Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:25.8562765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:25.8563484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:25.8895681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:25.8896144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:25.8900294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:25.8900771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:26.0237917Z dist init r=1, world=2 2022-08-17T13:38:26.0241819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:26.0612496Z dist init r=0, world=2 
2022-08-17T13:38:26.0617213Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:26.0618074Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:26.0650231Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:27.4661727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:27.4662264Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:27.5578517Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:38:27.5579840Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:38:28.0232159Z ok (3.613s) 2022-08-17T13:38:28.0251888Z test_fsdp_state_dict_keys_state_dict_type_local_state_dict (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74171 2022-08-17T13:38:28.0257724Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74172 2022-08-17T13:38:29.5091028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:29.5091527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:29.5094312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:29.5094832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:29.5455407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:29.5455873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:29.5459631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:29.5460115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:29.6831725Z dist init r=1, world=2 2022-08-17T13:38:29.6835855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:29.7138956Z dist init r=0, world=2 2022-08-17T13:38:29.7143286Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:29.7144215Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:29.7245271Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:38:31.0954256Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:31.0954773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:31.1167649Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:38:31.1168706Z warnings.warn( 2022-08-17T13:38:31.1170970Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:38:31.1171788Z warnings.warn( 2022-08-17T13:38:31.5346046Z ok (3.511s) 2022-08-17T13:38:31.5365351Z test_fsdp_state_dict_keys_state_dict_type_sharded_state_dict (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74250 2022-08-17T13:38:31.5371353Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74251 2022-08-17T13:38:33.0037790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:33.0038310Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:33.0040188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:33.0040658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:33.0224764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:33.0225235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:33.0229630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:33.0230112Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:33.1703815Z dist init r=0, world=2 2022-08-17T13:38:33.1707799Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:33.1944536Z dist init r=1, world=2 2022-08-17T13:38:33.1949495Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:33.1950508Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:33.2014742Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:38:34.5839604Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:34.5840138Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:34.6086836Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:38:34.6087721Z warnings.warn( 2022-08-17T13:38:34.6089119Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:38:34.6089879Z warnings.warn( 2022-08-17T13:38:35.0458145Z ok (3.511s) 2022-08-17T13:38:35.0478308Z test_fsdp_state_dict_keys_state_dict_type_state_dict (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74329 2022-08-17T13:38:35.0484188Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74330 2022-08-17T13:38:36.4774776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:36.4775289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:36.4777870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:36.4778359Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:36.4956419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:36.4956868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:36.4961126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:36.4961609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:36.6473089Z dist init r=0, world=2 2022-08-17T13:38:36.6477019Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:36.6688405Z dist init r=1, world=2 2022-08-17T13:38:36.6693136Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:36.6694179Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:36.6784476Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:38:38.0436099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:38.0436615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:38.0646847Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:38:38.0647864Z warnings.warn( 2022-08-17T13:38:38.0682990Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:38:38.0683882Z warnings.warn( 2022-08-17T13:38:38.4570906Z ok (3.411s) 2022-08-17T13:38:38.4578343Z test_fsdp_state_dict_with_activation_checkpoint_checkpoint_wrap_both (__main__.TestFSDPStateDict) 2022-08-17T13:38:38.4592458Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74408 2022-08-17T13:38:38.4597868Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74409 2022-08-17T13:38:39.8807875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:39.8808355Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:39.8810823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:39.8811308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:39.9057453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:39.9058163Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:39.9062338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:39.9062830Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:40.0500762Z dist init r=1, world=2 2022-08-17T13:38:40.0504728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:40.0781897Z dist init r=0, world=2 2022-08-17T13:38:40.0786984Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:40.0788180Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:40.0810895Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:38:41.4576151Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:41.4576694Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:41.8682912Z ok (3.411s) 2022-08-17T13:38:41.8690228Z test_fsdp_state_dict_with_activation_checkpoint_checkpoint_wrap_first (__main__.TestFSDPStateDict) 2022-08-17T13:38:41.8704206Z Tests saving the state dict, zeroing a target model's parameters, and ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74487 2022-08-17T13:38:41.8710205Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74488 2022-08-17T13:38:43.3196152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:43.3196658Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:43.3197224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:43.3197699Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:43.3199692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:43.3200185Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:43.3200759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:43.3201227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:43.4924767Z dist init r=1, world=2 2022-08-17T13:38:43.4928820Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:43.4983804Z dist init r=0, world=2 2022-08-17T13:38:43.4988855Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:43.4990307Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:43.5032080Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:44.8793261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:44.8793780Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:45.3798940Z ok (3.511s) 2022-08-17T13:38:45.3805962Z test_fsdp_state_dict_with_activation_checkpoint_checkpoint_wrap_second (__main__.TestFSDPStateDict) 2022-08-17T13:38:45.3819744Z Tests saving the state dict, zeroing a target model's parameters, and ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74566 2022-08-17T13:38:45.3826388Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74567 2022-08-17T13:38:46.8654367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:46.8655143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:46.8657859Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:46.8658345Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:46.8749775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:46.8750213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:46.8754514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:46.8754988Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:47.0396512Z dist init r=0, world=2 2022-08-17T13:38:47.0400490Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:47.0480355Z dist init r=1, world=2 2022-08-17T13:38:47.0485330Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:47.0486104Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:47.0503866Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:48.4294210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:48.4294741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:48.8914422Z ok (3.512s) 2022-08-17T13:38:48.8934493Z test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:38:48.8947888Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74645 2022-08-17T13:38:48.8954003Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74646 2022-08-17T13:38:50.3402547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:50.3403095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:50.3405642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:50.3406132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:50.3559901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:50.3560367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:50.3564464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:50.3565169Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:50.5131834Z dist init r=1, world=2 2022-08-17T13:38:50.5135954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:50.5233208Z dist init r=0, world=2 2022-08-17T13:38:50.5237791Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:50.5238628Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:50.5239344Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:38:51.9075809Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:51.9076751Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:51.9361518Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:38:51.9362099Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:38:51.9362788Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:38:51.9363338Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:38:52.8052741Z ok (3.914s) 2022-08-17T13:38:52.8074065Z test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:38:52.8088386Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74728 2022-08-17T13:38:52.8094281Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74729 2022-08-17T13:38:54.2593849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:54.2594360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:54.2596602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:54.2597069Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:54.2806073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:54.2806553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:54.2810888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:54.2811369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:54.4264076Z dist init r=0, world=2 2022-08-17T13:38:54.4268458Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:54.4523133Z dist init r=1, world=2 2022-08-17T13:38:54.4527435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:54.4528666Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:54.4575227Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:38:55.8214447Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:55.8215047Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:56.2177735Z ok (3.412s) 2022-08-17T13:38:56.2198159Z test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:38:56.2210811Z Test that saving after some training results in params being updated as ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74807 2022-08-17T13:38:56.2216880Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74808 2022-08-17T13:38:57.6551750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:57.6552456Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:57.6554521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:57.6555439Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:57.6795269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:38:57.6795901Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:38:57.6799664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:38:57.6800351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:38:57.8232916Z dist init r=0, world=2 2022-08-17T13:38:57.8237067Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:38:57.8521707Z dist init r=1, world=2 2022-08-17T13:38:57.8526451Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:38:57.8527423Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:57.8543515Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:38:59.2140449Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:38:59.2141001Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:38:59.2452531Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:38:59.2453110Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:38:59.2453816Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:38:59.2454339Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:00.1312447Z ok (3.913s) 2022-08-17T13:39:00.1331840Z test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:39:00.1344775Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74890 2022-08-17T13:39:00.1351197Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74891 2022-08-17T13:39:01.6312087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:01.6312601Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:01.6314692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:01.6315185Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:01.6424001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:01.6424702Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:01.6429354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:01.6429836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:01.7991085Z dist init r=1, world=2 2022-08-17T13:39:01.7994525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:01.8158593Z dist init r=0, world=2 2022-08-17T13:39:01.8163492Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:01.8164491Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:01.8199490Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:39:03.1966751Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:03.1967278Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:03.6440097Z ok (3.513s) 2022-08-17T13:39:03.6459499Z test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:39:03.6473159Z Test that saving after some training results in params being updated as ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74969 2022-08-17T13:39:03.6479131Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74970 2022-08-17T13:39:05.1444697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:05.1445208Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:05.1447745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:05.1448238Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:05.1660887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:05.1661351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:05.1665914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:05.1666391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:05.3123185Z dist init r=1, world=2 2022-08-17T13:39:05.3126759Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:05.3396525Z dist init r=0, world=2 2022-08-17T13:39:05.3401491Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:05.3402236Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:05.3433265Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:06.7089740Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:06.7090253Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:06.7330634Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:06.7331214Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:06.7331941Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:06.7332754Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:07.6577596Z ok (4.014s) 2022-08-17T13:39:07.6597687Z test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:39:07.6610772Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75052 2022-08-17T13:39:07.6616875Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75053 2022-08-17T13:39:09.1019514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:09.1020348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:09.1022583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:09.1023077Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:09.1320445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:09.1320926Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:09.1325021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:09.1325497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:09.2690244Z dist init r=0, world=2 2022-08-17T13:39:09.2694029Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:09.3048425Z dist init r=1, world=2 2022-08-17T13:39:09.3052843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:09.3053918Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:09.3102548Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:39:10.6788512Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:10.6789031Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:11.0704039Z ok (3.412s) 2022-08-17T13:39:11.0723154Z test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:39:11.0736105Z Test that saving after some training results in params being updated as ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75131 2022-08-17T13:39:11.0742103Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75132 2022-08-17T13:39:12.5339920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:12.5340924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:12.5342144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:12.5342652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:12.5730730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:12.5731193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:12.5735888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:12.5736377Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:12.7069847Z dist init r=0, world=2 2022-08-17T13:39:12.7074966Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:12.7436635Z dist init r=1, world=2 2022-08-17T13:39:12.7441761Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:12.7443073Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:12.7483992Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:14.1200853Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:14.1201863Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:14.1493390Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:14.1494462Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:14.1495727Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:14.1496738Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:15.0837592Z ok (4.013s) 2022-08-17T13:39:15.0857107Z test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:39:15.0870197Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75214 2022-08-17T13:39:15.0875949Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75215 2022-08-17T13:39:16.5446888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:16.5447390Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:16.5449467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:16.5449954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:16.5816412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:16.5816859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:16.5821007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:16.5821501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:16.7107775Z dist init r=1, world=2 2022-08-17T13:39:16.7111469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:16.7507534Z dist init r=0, world=2 2022-08-17T13:39:16.7512281Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:16.7513015Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:16.7519542Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:39:18.0928068Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:18.0928601Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:18.4963113Z ok (3.412s) 2022-08-17T13:39:18.4981743Z test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:39:18.4995316Z Test that saving after some training results in params being updated as ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75293 2022-08-17T13:39:18.5001484Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75294 2022-08-17T13:39:19.9296878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:19.9297384Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:19.9299563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:19.9300057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:19.9798260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:19.9798730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:19.9802689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:19.9803172Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:20.0958365Z dist init r=0, world=2 2022-08-17T13:39:20.0962470Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:20.1460533Z dist init r=1, world=2 2022-08-17T13:39:20.1464965Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:20.1465986Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:20.1472201Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:21.5188602Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:21.5189127Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:21.5449353Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:21.5449906Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:21.5450606Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:21.5451168Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:22.4108185Z ok (3.914s) 2022-08-17T13:39:22.4127187Z test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:39:22.4140551Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75376 2022-08-17T13:39:22.4146513Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75377 2022-08-17T13:39:23.8408699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:23.8409226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:23.8411020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:23.8411511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:23.9066808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:23.9067274Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:23.9071490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:23.9071999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:24.0064982Z dist init r=1, world=2 2022-08-17T13:39:24.0069240Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:24.0759683Z dist init r=0, world=2 2022-08-17T13:39:24.0764196Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:24.0765273Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:24.0782043Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:39:25.4512303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:25.4512843Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:25.4800337Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:25.4800885Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:25.4802366Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:25.4802924Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:26.4255571Z ok (4.015s) 2022-08-17T13:39:26.4275626Z test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-08-17T13:39:26.4288673Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75459 2022-08-17T13:39:26.4294611Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75460 2022-08-17T13:39:27.9036201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:27.9036777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:27.9039039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:27.9039541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:27.9430843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:27.9431342Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:27.9435522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:27.9436023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:28.0700874Z dist init r=0, world=2 2022-08-17T13:39:28.0704980Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:28.1158367Z dist init r=1, world=2 2022-08-17T13:39:28.1162691Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:28.1163480Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:28.1214949Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:39:29.4871582Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:29.4872111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:29.5134782Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:29.5135445Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:29.5136163Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:29.5136946Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:30.4403219Z ok (4.015s) 2022-08-17T13:39:30.4422864Z test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-08-17T13:39:30.4435474Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75542 2022-08-17T13:39:30.4441450Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75543 2022-08-17T13:39:31.8879122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:31.8879643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:31.8882078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:31.8882573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:31.9055642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:31.9056142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:31.9060444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:31.9060940Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:32.0572858Z dist init r=1, world=2 2022-08-17T13:39:32.0576804Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:32.0786332Z dist init r=0, world=2 2022-08-17T13:39:32.0791287Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:32.0792231Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:32.0883727Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:39:33.4553824Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:33.4554452Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:33.4843362Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:33.4844170Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:33.4845419Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:33.4845978Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:34.3540570Z ok (3.914s) 2022-08-17T13:39:34.3563751Z test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_False_fsdp_root_False (__main__.TestFSDPStateDict) 2022-08-17T13:39:34.3578049Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75625 2022-08-17T13:39:34.3583821Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75626 2022-08-17T13:39:35.7928587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:35.7929544Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:35.7930724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:35.7931665Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:35.8269352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:35.8270296Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:35.8275353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:35.8276291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:35.9584030Z dist init r=1, world=2 2022-08-17T13:39:35.9588722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:36.0018274Z dist init r=0, world=2 2022-08-17T13:39:36.0023632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:36.0025326Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:36.0099012Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:39:37.3692566Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:37.3693616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:37.8131937Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:37.8133040Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:37.8231901Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:37.8232924Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:38.3682433Z ok (4.014s) 2022-08-17T13:39:38.3704815Z test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_False_fsdp_root_True (__main__.TestFSDPStateDict) 2022-08-17T13:39:38.3718179Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75708 2022-08-17T13:39:38.3723939Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75709 2022-08-17T13:39:39.8061623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:39.8062625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:39.8064121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:39.8065089Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:39.8326812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:39.8327768Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:39.8333050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:39.8334023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:39.9740914Z dist init r=1, world=2 2022-08-17T13:39:39.9745033Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:40.0062423Z dist init r=0, world=2 2022-08-17T13:39:40.0068166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:40.0069551Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:40.0153573Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:39:41.3639171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:41.3639678Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:41.3848955Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:39:41.3849741Z warnings.warn( 2022-08-17T13:39:41.3850859Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:39:41.3851613Z warnings.warn( 2022-08-17T13:39:41.3881470Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:41.3882019Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:41.3882718Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:41.3883266Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:42.2819136Z ok (3.913s) 2022-08-17T13:39:42.2841498Z test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_True_fsdp_root_False (__main__.TestFSDPStateDict) 2022-08-17T13:39:42.2855137Z Tests that FSDP's state_dict can be loaded into a local model. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75791 2022-08-17T13:39:42.2861249Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75792 2022-08-17T13:39:43.6905173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:43.6905685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:43.6908073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:43.6908563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:43.7460069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:43.7460535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:43.7464996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: 
loaded 238 disabled tests 2022-08-17T13:39:43.7465497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:43.8572277Z dist init r=0, world=2 2022-08-17T13:39:43.8576279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:43.9178198Z dist init r=1, world=2 2022-08-17T13:39:43.9182795Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:43.9183853Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:43.9187226Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:45.2949284Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:45.2950114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:45.6944572Z ok (3.412s) 2022-08-17T13:39:45.6966328Z test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_True_fsdp_root_True (__main__.TestFSDPStateDict) 2022-08-17T13:39:45.6979696Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75870 2022-08-17T13:39:45.6985665Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75871 2022-08-17T13:39:47.1786146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:47.1787092Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:47.1788650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:47.1789634Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:47.1929201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:47.1930173Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:47.1935230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:47.1936212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:47.3458487Z dist init r=0, world=2 2022-08-17T13:39:47.3462571Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:47.3660782Z dist init r=1, world=2 2022-08-17T13:39:47.3666148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:47.3667500Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:47.3668785Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:39:48.7405855Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:48.7406735Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:49.1070844Z ok (3.413s) 2022-08-17T13:39:49.1092093Z test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_False_fsdp_root_False (__main__.TestFSDPStateDict) 2022-08-17T13:39:49.1106051Z Tests that FSDP's state_dict can be loaded into a local model. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75949 2022-08-17T13:39:49.1114229Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75950 2022-08-17T13:39:50.5423820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:50.5424774Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:50.5426891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:50.5428122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:50.5576520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:50.5577430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:50.5580953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:50.5581920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:50.7184623Z dist init r=0, world=2 2022-08-17T13:39:50.7188775Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:50.7271764Z dist init r=1, world=2 2022-08-17T13:39:50.7276429Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:50.7277568Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:50.7291767Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:52.0919548Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:52.0920084Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:52.5184261Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:52.5184886Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:52.5239180Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:52.5239783Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:53.0217486Z ok (3.915s) 2022-08-17T13:39:53.0239656Z test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_False_fsdp_root_True (__main__.TestFSDPStateDict) 2022-08-17T13:39:53.0253325Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76032 2022-08-17T13:39:53.0259291Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76033 2022-08-17T13:39:54.4664880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:54.4665426Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:54.4667843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:54.4668549Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:54.4940177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:54.4940933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:54.4945035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:54.4945781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:54.6325197Z dist init r=0, world=2 2022-08-17T13:39:54.6329042Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:54.6668736Z dist init r=1, world=2 2022-08-17T13:39:54.6673207Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:54.6674284Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:54.6737800Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:39:56.0330274Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:39:56.0331287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:56.0530920Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:39:56.0532920Z warnings.warn( 2022-08-17T13:39:56.0535230Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:39:56.0536782Z warnings.warn( 2022-08-17T13:39:56.0562406Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:56.0563571Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:56.0565028Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:39:56.0566165Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:39:56.9355866Z ok (3.914s) 2022-08-17T13:39:56.9377601Z test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_True_fsdp_root_False (__main__.TestFSDPStateDict) 2022-08-17T13:39:56.9391255Z Tests that FSDP's state_dict can be loaded into a local model. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76115 2022-08-17T13:39:56.9397065Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76116 2022-08-17T13:39:58.4186756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:58.4187252Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:58.4190198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:39:58.4190707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:58.4449273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:39:58.4449726Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:39:58.4453722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 
238 disabled tests 2022-08-17T13:39:58.4454210Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:39:58.5888346Z dist init r=1, world=2 2022-08-17T13:39:58.5892380Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:39:58.6164118Z dist init r=0, world=2 2022-08-17T13:39:58.6168786Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:39:58.6169710Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:58.6199159Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:39:59.9853795Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:39:59.9854359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:00.4290664Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:00.4291245Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:00.4404166Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:00.4404731Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:00.9495205Z ok (4.014s) 2022-08-17T13:40:00.9517445Z test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_True_fsdp_root_True (__main__.TestFSDPStateDict) 2022-08-17T13:40:00.9530732Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76198 2022-08-17T13:40:00.9536359Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76199 2022-08-17T13:40:02.3923066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:02.3923556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:02.3926758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:02.3927584Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:02.4287979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:02.4288448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:02.4292098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:02.4292581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:02.5614735Z dist init r=0, world=2 2022-08-17T13:40:02.5618525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:02.6037114Z dist init r=1, world=2 2022-08-17T13:40:02.6042075Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:02.6042910Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:02.6128436Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:40:03.9722332Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:03.9722895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:03.9930976Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:03.9931770Z warnings.warn( 2022-08-17T13:40:03.9933198Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:40:03.9934046Z warnings.warn( 2022-08-17T13:40:03.9963686Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:03.9964259Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:03.9964961Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:03.9965505Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:04.8634735Z ok (3.914s) 2022-08-17T13:40:04.8650048Z test_state_dict_rank0_offload_save_load_flow (__main__.TestFSDPStateDict) 2022-08-17T13:40:04.8662982Z Tests saving a model checkpoint only on rank 0 and loading it only ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76281 2022-08-17T13:40:04.8669103Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76282 2022-08-17T13:40:06.2996913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:06.2997435Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:06.2998945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:06.2999433Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:06.3246827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:06.3247573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:06.3251204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:06.3251922Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:06.4678464Z dist init r=0, world=2 2022-08-17T13:40:06.4682222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:06.4914725Z dist init r=1, world=2 2022-08-17T13:40:06.4918959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:06.4920059Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:06.4989006Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:07.8395930Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:07.8396467Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:08.4758922Z ok (3.612s) 2022-08-17T13:40:08.4775900Z test_state_dict_save_load_flow_state_dict_type_local_state_dict (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76360 2022-08-17T13:40:08.4782069Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76361 2022-08-17T13:40:09.9342628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:09.9343174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:09.9346128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:09.9346634Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:09.9764576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:09.9765285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:09.9777294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:09.9777780Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:10.1055661Z dist init r=0, world=2 2022-08-17T13:40:10.1059717Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:10.1504579Z dist init r=1, world=2 2022-08-17T13:40:10.1509530Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:10.1510326Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:10.1569497Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:40:11.5173232Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:11.5173754Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:11.5368880Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:11.5369672Z warnings.warn( 2022-08-17T13:40:11.5370790Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:11.5371552Z warnings.warn( 2022-08-17T13:40:11.5399340Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:11.5399908Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:11.5400929Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:11.5401481Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:11.9777594Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:40:11.9778126Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:40:12.3876081Z ok (3.912s) 2022-08-17T13:40:12.3892595Z test_state_dict_save_load_flow_state_dict_type_sharded_state_dict (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76443 2022-08-17T13:40:12.3898677Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76444 2022-08-17T13:40:13.8914861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:13.8915413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:13.8917725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:13.8918231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:13.9245173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:13.9245849Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:13.9249675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:13.9250158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:14.0597544Z dist init r=1, world=2 2022-08-17T13:40:14.0601509Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:14.0980950Z dist init r=0, world=2 2022-08-17T13:40:14.0985718Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:14.0986883Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:40:14.1008714Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:15.5008028Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:15.5008597Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:15.5248436Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:15.5249222Z warnings.warn( 2022-08-17T13:40:15.5250336Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:40:15.5251071Z warnings.warn( 2022-08-17T13:40:15.5280336Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:15.5280907Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:15.5281598Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:15.5282144Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:15.9732760Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:40:15.9733321Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:40:16.3996315Z ok (4.012s) 2022-08-17T13:40:16.4013371Z test_state_dict_save_load_flow_state_dict_type_state_dict (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76526 2022-08-17T13:40:16.4019229Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76527 2022-08-17T13:40:17.8400377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:17.8400886Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:17.8403217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:17.8403730Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:17.8622070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:17.8622763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:17.8626940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:17.8627410Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:18.0061864Z dist init r=0, world=2 2022-08-17T13:40:18.0065590Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:18.0340343Z dist init r=1, world=2 2022-08-17T13:40:18.0345018Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:18.0346324Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:18.0372171Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:40:19.3943678Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:19.3944490Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:19.4209202Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:19.4210011Z warnings.warn( 2022-08-17T13:40:19.4211176Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:19.4211920Z warnings.warn( 2022-08-17T13:40:19.4239862Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:19.4240410Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:19.4241251Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:19.4241797Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:19.8782183Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:40:19.8782741Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:40:20.3115702Z ok (3.912s) 2022-08-17T13:40:20.3151951Z test_state_dict_skip_module_state_dict_type_local_state_dict_double_nest_True (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76609 2022-08-17T13:40:20.3158035Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76610 2022-08-17T13:40:21.7442890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:21.7443411Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:21.7445783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:21.7446292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:21.7744696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:21.7745397Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:21.7748394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:21.7748881Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:21.9122678Z dist init r=1, world=2 2022-08-17T13:40:21.9126739Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:21.9464053Z dist init r=0, world=2 2022-08-17T13:40:21.9469476Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:21.9470502Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:40:21.9535223Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:23.3296492Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:23.3297025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:23.3517739Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:23.3518329Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:23.3519036Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:23.3519577Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:24.2255676Z ok (3.914s) 2022-08-17T13:40:24.2291402Z test_state_dict_skip_module_state_dict_type_sharded_state_dict_double_nest_True (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76692 2022-08-17T13:40:24.2297184Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76693 2022-08-17T13:40:25.6767690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:25.6768237Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:25.6770978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:25.6771466Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:25.6898869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:25.6899369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:25.6903511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:25.6903999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:25.8445730Z dist init r=0, world=2 2022-08-17T13:40:25.8449977Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:25.8592093Z dist init r=1, world=2 2022-08-17T13:40:25.8596851Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:25.8597576Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:25.8655095Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:40:27.2199073Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:27.2199904Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:27.2477468Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:27.2478054Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:27.2478753Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:27.2479302Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:28.1395379Z ok (3.914s) 2022-08-17T13:40:28.1432149Z test_state_dict_skip_module_state_dict_type_state_dict_double_nest_True (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76775 2022-08-17T13:40:28.1437700Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76776 2022-08-17T13:40:29.5748570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:29.5749067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:29.5751478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:29.5751965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:29.6053347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:29.6053808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:29.6057964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:29.6058448Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:29.7404023Z dist init r=0, world=2 2022-08-17T13:40:29.7408011Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:29.7785516Z dist init r=1, world=2 2022-08-17T13:40:29.7790704Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:29.7791966Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:29.7816287Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:40:31.1403637Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:31.1404162Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:31.1636710Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:31.1637285Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:31.1671247Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:40:31.1671812Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:40:32.0533152Z ok (3.914s) 2022-08-17T13:40:32.0550916Z test_state_dict_type (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76858 2022-08-17T13:40:32.0556978Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76859 2022-08-17T13:40:33.5519990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:33.5520550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:33.5522326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:33.5522832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:33.5570131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:33.5570601Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:33.5574384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: 
loaded 238 disabled tests 2022-08-17T13:40:33.5574869Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:33.7238504Z dist init r=0, world=2 2022-08-17T13:40:33.7241890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:33.7271185Z dist init r=1, world=2 2022-08-17T13:40:33.7275865Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:33.7276875Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:33.7345822Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:35.0974145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:35.0974681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:35.5644608Z ok (3.511s) 2022-08-17T13:40:35.5677864Z test_state_dict_with_ignored_modules_prefix_False_ignore_inner_False (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76937 2022-08-17T13:40:35.5683677Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76938 2022-08-17T13:40:37.0105139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:37.0105663Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:37.0108719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:37.0109187Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:37.0748286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:37.0748762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:37.0752697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:37.0753172Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:37.1781591Z dist init r=1, world=2 2022-08-17T13:40:37.1785354Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:37.2490181Z dist init r=0, world=2 2022-08-17T13:40:37.2494503Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:37.2495666Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:37.2499118Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:40:38.6249020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:38.6249545Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:38.6449614Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:38.6450489Z warnings.warn( 2022-08-17T13:40:38.6484478Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:38.6485233Z warnings.warn( 2022-08-17T13:40:39.0772873Z ok (3.513s) 2022-08-17T13:40:39.0806517Z test_state_dict_with_ignored_modules_prefix_False_ignore_inner_True (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77016 2022-08-17T13:40:39.0812183Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77017 2022-08-17T13:40:40.5063100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:40.5063842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:40.5066309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:40.5066814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:40.5339927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:40.5340388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:40.5344856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:40.5345353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:40.6728736Z dist init r=0, world=2 2022-08-17T13:40:40.6732735Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:40.7066703Z dist init r=1, world=2 2022-08-17T13:40:40.7071067Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:40.7072124Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:40.7141154Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:40:42.0718554Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:42.0719577Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:42.0926901Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1152: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not supported: Linear(in_features=4, out_features=4, bias=True) 2022-08-17T13:40:42.0928153Z warnings.warn( 2022-08-17T13:40:42.0929936Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1152: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not supported: Linear(in_features=4, out_features=4, bias=True) 2022-08-17T13:40:42.0931199Z warnings.warn( 2022-08-17T13:40:42.0933939Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:42.0935628Z warnings.warn( 2022-08-17T13:40:42.0937832Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:40:42.0939303Z warnings.warn( 2022-08-17T13:40:42.4898730Z ok (3.412s) 2022-08-17T13:40:42.4932463Z test_state_dict_with_ignored_modules_prefix_True_ignore_inner_False (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77095 2022-08-17T13:40:42.4938295Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77096 2022-08-17T13:40:43.9462742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:43.9463969Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:43.9465768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:43.9466704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:43.9840218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:43.9841145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:43.9844963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:43.9845923Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:44.1149727Z dist init r=1, world=2 2022-08-17T13:40:44.1154188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:44.1555098Z dist init r=0, world=2 2022-08-17T13:40:44.1560060Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:44.1561416Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:40:44.1562738Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:45.5386120Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:45.5387131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:45.5612386Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:45.5613894Z warnings.warn( 2022-08-17T13:40:45.5616114Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:45.5617581Z warnings.warn( 2022-08-17T13:40:46.0026004Z ok (3.513s) 2022-08-17T13:40:46.0059835Z test_state_dict_with_ignored_modules_prefix_True_ignore_inner_True (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77174 2022-08-17T13:40:46.0066037Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77175 2022-08-17T13:40:47.3790817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:47.3791326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:47.3793689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:47.3794188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:47.4517124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:47.4517898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:47.4520689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:47.4521163Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:47.5470683Z dist init r=1, world=2 2022-08-17T13:40:47.5474377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:47.6242058Z dist init r=0, world=2 2022-08-17T13:40:47.6246628Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:47.6247667Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:47.6289246Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:40:49.0024876Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:49.0025460Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:49.0246895Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1152: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not supported: Linear(in_features=4, out_features=4, bias=True) 2022-08-17T13:40:49.0248189Z warnings.warn( 2022-08-17T13:40:49.0250010Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1152: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not supported: Linear(in_features=4, out_features=4, bias=True) 2022-08-17T13:40:49.0251329Z warnings.warn( 2022-08-17T13:40:49.0253706Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:49.0255314Z warnings.warn( 2022-08-17T13:40:49.0257587Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:40:49.0259044Z warnings.warn( 2022-08-17T13:40:49.4161907Z ok (3.413s) 2022-08-17T13:40:49.4178878Z test_wrong_state_dict_config (__main__.TestFSDPStateDict) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77253 2022-08-17T13:40:49.4185207Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77254 2022-08-17T13:40:50.8543052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:50.8543988Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:50.8545567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:50.8546048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:50.8660170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:50.8660866Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:50.8665212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:50.8665707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:51.0208821Z dist init r=1, world=2 2022-08-17T13:40:51.0212416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:51.0388825Z dist init r=0, world=2 2022-08-17T13:40:51.0393504Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:51.0394309Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:40:51.0417191Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:52.4135214Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:52.4135763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:52.4328457Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:52.4329245Z warnings.warn( 2022-08-17T13:40:52.4363464Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:40:52.4364218Z warnings.warn( 2022-08-17T13:40:52.9272654Z ok (3.511s) 2022-08-17T13:40:52.9272892Z 2022-08-17T13:40:52.9273292Z ---------------------------------------------------------------------- 2022-08-17T13:40:52.9273619Z Ran 63 tests in 230.848s 2022-08-17T13:40:52.9273794Z 2022-08-17T13:40:52.9273890Z OK 2022-08-17T13:40:52.9274028Z 2022-08-17T13:40:52.9274159Z Generating XML reports... 
2022-08-17T13:40:52.9387703Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_state_dict/TEST-TestFSDPStateDict-20220817133702.xml 2022-08-17T13:40:53.2794694Z Running distributed/optim/test_zero_redundancy_optimizer ... [2022-08-17 13:40:53.278872] 2022-08-17T13:40:53.2795547Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/optim/test_zero_redundancy_optimizer.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:40:53.278942] 2022-08-17T13:40:55.1776897Z Test results will be stored in test-reports/python-unittest/distributed.optim.test_zero_redundancy_optimizer 2022-08-17T13:40:55.2121695Z 2022-08-17T13:40:55.2122307Z Running tests... 2022-08-17T13:40:55.2123185Z ---------------------------------------------------------------------- 2022-08-17T13:40:55.2139131Z test_add_param_group (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:40:56.6924906Z Check that ZeroRedundancyOptimizer properly handles adding a new ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:40:56.7083696Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/67287 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.496s) 2022-08-17T13:40:56.7098989Z test_collect_shards (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:40:56.7131081Z Check the state consolidation mechanism and the state dict exposed ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77367 2022-08-17T13:40:56.7137855Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77368 2022-08-17T13:40:58.3997938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:58.3998436Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:58.3999024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:58.3999496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:58.4102293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:40:58.4102759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:40:58.4105741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:40:58.4106213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:40:58.5644356Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:40:58.5647155Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:40:58.5796514Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:40:58.5800343Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:40:58.5801185Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:40:58.5852178Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:41:01.9266652Z ok (5.218s) 2022-08-17T13:41:01.9277536Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_False_static_graph_False_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:01.9290420Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77451 2022-08-17T13:41:01.9296460Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77452 2022-08-17T13:41:03.6238616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:03.6239570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:03.6240753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:03.6241658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:03.6292486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:03.6293768Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:03.6295850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:03.6296780Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:03.7901366Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:03.7904819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:41:03.7996267Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:03.8000446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-08-17T13:41:03.8002283Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:03.8008161Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:05.2740165Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:05.2744219Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:05.2911324Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:05.2911931Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:05.2912659Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:05.2913209Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:05.7184532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:05.7185096Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:05.7600871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:05.7603642Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:06.2404624Z ok (4.314s) 2022-08-17T13:41:06.2414783Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_False_static_graph_False_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:06.2428018Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77564 2022-08-17T13:41:06.2434032Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77565 2022-08-17T13:41:07.9151315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:07.9151824Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:07.9152769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:07.9153243Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:07.9665704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:07.9666187Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:07.9669347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:07.9670091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:08.0798149Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:08.0801247Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:41:08.1367535Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:08.1371504Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-08-17T13:41:08.1372603Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:08.1412179Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:09.5899628Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:09.5903240Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:09.6061559Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:09.6062264Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:09.6062980Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:09.6063838Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:10.0338026Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:10.0338557Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:10.0751548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:10.0752044Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:10.5539266Z ok (4.313s) 2022-08-17T13:41:10.5549828Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_False_static_graph_True_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:10.5562903Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77677 2022-08-17T13:41:10.5569219Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77678 2022-08-17T13:41:12.1906075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:12.1906556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:12.1907886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:12.1908363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:12.2290470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:12.2290912Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:12.2293423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:12.2293909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:12.3538922Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:12.3542196Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:41:12.3985261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:12.3989114Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-08-17T13:41:12.3990039Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:12.4052504Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:13.8370375Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:13.8374975Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:13.8545178Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:13.8545756Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:13.8546465Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:13.8547013Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:14.2841948Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:14.2857218Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:14.3404081Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:14.3404593Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:14.8674098Z ok (4.313s) 2022-08-17T13:41:14.8684789Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_False_static_graph_True_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:14.8698549Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77790 2022-08-17T13:41:14.8704898Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77791 2022-08-17T13:41:16.5485439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:16.5485951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:16.5486995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:16.5487458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:16.5723382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:16.5723839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:16.5726121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:16.5726582Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:16.7128432Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:16.7131119Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:41:16.7435220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:16.7438490Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 0 2022-08-17T13:41:16.7439693Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:16.7539990Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:18.1941220Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:18.1945598Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:18.2112936Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:18.2113572Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:18.2114265Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:18.2114803Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:18.6452549Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:18.6453150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:18.6928316Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:18.6928819Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:19.1809404Z ok (4.313s) 2022-08-17T13:41:19.1819488Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_True_static_graph_False_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:19.1833286Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77903 2022-08-17T13:41:19.1839637Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77904 2022-08-17T13:41:20.8399775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:20.8400280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:20.8401344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:20.8401817Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:20.8760005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:20.8760476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:20.8762960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:20.8763441Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:21.0255134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:21.0258246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:41:21.0449990Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:21.0453451Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-08-17T13:41:21.0454533Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:21.0462817Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:22.4977715Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:22.4980036Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:22.5146716Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:22.5147316Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:22.5148018Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:22.5148544Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:22.9414346Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:22.9414898Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:22.9786514Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:22.9787012Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:23.4946810Z ok (4.314s) 2022-08-17T13:41:23.4959067Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_True_static_graph_False_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:23.4972296Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78016 2022-08-17T13:41:23.4979063Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78017 2022-08-17T13:41:25.1570643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:25.1571509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:25.1572110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:25.1572581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:25.1823136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:25.1824063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:25.1825674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:25.1826491Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:25.3241487Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:25.3244758Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:41:25.3518190Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:25.3521885Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-08-17T13:41:25.3523310Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:25.3551330Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:26.7936483Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:26.7939900Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:26.8093785Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:26.8094705Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:26.8095430Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:26.8095966Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:27.2330241Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:27.2330849Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:27.2719064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:27.2719567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:27.7080943Z ok (4.213s) 2022-08-17T13:41:27.7091295Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_True_static_graph_True_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:27.7105015Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78129 2022-08-17T13:41:27.7111938Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78130 2022-08-17T13:41:29.3684375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:29.3684887Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:29.3685463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:29.3685937Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:29.3826080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:29.3826561Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:29.3829368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:29.3829856Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:29.5316043Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:29.5321366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:41:29.5476071Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:29.5479130Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-08-17T13:41:29.5479900Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:29.5524630Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:30.9908531Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:30.9913068Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:31.0064020Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:31.0064645Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:31.0065359Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:31.0066330Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:31.4093156Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:31.4093684Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:31.4538592Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:31.4539138Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:31.9215802Z ok (4.213s) 2022-08-17T13:41:31.9227700Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_True_static_graph_True_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:31.9242302Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78242 2022-08-17T13:41:31.9470211Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78243 2022-08-17T13:41:33.6173377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:33.6174330Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:33.6175525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:33.6176430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:33.6574305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:33.6575264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:33.6576969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:33.6577941Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:33.8026210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:33.8029367Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:41:33.8285817Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:33.8289516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 0 2022-08-17T13:41:33.8290298Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:33.8337041Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:35.2770354Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:35.2773960Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:35.2927997Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:35.2928585Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:35.2929290Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:35.2929841Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:35.7045758Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:35.7046280Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:35.7593926Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:35.7594442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:36.2574430Z ok (4.336s) 2022-08-17T13:41:36.2586560Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_False_static_graph_False_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:36.2600593Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78355 2022-08-17T13:41:36.2607623Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78356 2022-08-17T13:41:37.8806919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:37.8807424Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:37.8808045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:37.8808525Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:37.9403822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:37.9404298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:37.9406967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:37.9407446Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:38.0457590Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:38.0460932Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:41:38.1109865Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:38.1113519Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-08-17T13:41:38.1114929Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:38.1173995Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:39.5581241Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:39.5583753Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:39.5752041Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:39.5752662Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:39.5753359Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:39.5753901Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:39.9740296Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:39.9740840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:40.0175522Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:40.0176035Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:40.4711310Z ok (4.214s) 2022-08-17T13:41:40.4722223Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_False_static_graph_False_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:40.4736682Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78468 2022-08-17T13:41:40.4743740Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78469 2022-08-17T13:41:42.1608406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:42.1609157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:42.1609793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:42.1610278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:42.1844980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:42.1845428Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:42.1848156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:42.1848632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:42.3269604Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:42.3272555Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:41:42.3564624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:42.3568275Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-08-17T13:41:42.3569400Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:42.3578725Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:43.8214660Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:43.8219386Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:43.8368305Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:43.8369158Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:43.8369882Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:43.8370424Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:44.2349486Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:44.2350065Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:44.2793156Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:44.2793678Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:44.7849462Z ok (4.314s) 2022-08-17T13:41:44.7859666Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_False_static_graph_True_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:44.7874051Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78581 2022-08-17T13:41:44.7881198Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78582 2022-08-17T13:41:46.4439011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:46.4439529Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:46.4440134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:46.4440636Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:46.4524568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:46.4525029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:46.4527438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:46.4527922Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:46.6214757Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:46.6217862Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:41:46.6237704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:46.6241516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 0 2022-08-17T13:41:46.6242334Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:46.6321498Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:48.0457116Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:48.0459216Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:48.0614070Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:48.0614708Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:48.0615694Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:48.0616263Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:48.4702022Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:48.4702544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:48.5185734Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:48.5186235Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:48.9986621Z ok (4.214s) 2022-08-17T13:41:48.9997249Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_False_static_graph_True_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:49.0011881Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78694 2022-08-17T13:41:49.0019148Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78695 2022-08-17T13:41:50.6805003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:50.6805521Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:50.6806121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:50.6806604Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:50.6879204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:50.6879688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:50.6882096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:50.6882585Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:50.8473225Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:50.8475936Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:41:50.8602205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:50.8605942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 0 2022-08-17T13:41:50.8607011Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:50.8680859Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:52.3240248Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:52.3242386Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:52.3425178Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:52.3425747Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:52.3426447Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:52.3427015Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:52.7543677Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:52.7544455Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:52.8047367Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:52.8050945Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:53.3128277Z ok (4.314s) 2022-08-17T13:41:53.3139101Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_True_static_graph_False_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:53.3154471Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78807 2022-08-17T13:41:53.3162420Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78808 2022-08-17T13:41:54.9914161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:54.9914707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:54.9915728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:54.9916220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:55.0106637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:55.0107134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:55.0109498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:55.0110007Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:55.1588258Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:55.1591318Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:41:55.1833109Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:55.1836529Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-08-17T13:41:55.1837320Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:55.1897969Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:56.6284118Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:56.6286246Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:41:56.6465789Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:56.6466383Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:56.6467267Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:41:56.6467837Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:41:57.0482991Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:57.0483832Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:57.0900341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:41:57.0900833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:41:57.6270669Z ok (4.314s) 2022-08-17T13:41:57.6281131Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_True_static_graph_False_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:41:57.6295209Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78920 2022-08-17T13:41:57.6302304Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78921 2022-08-17T13:41:59.2257038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:59.2257534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:59.2258621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:59.2259086Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:59.2608697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:41:59.2609159Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:41:59.2611454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:41:59.2611923Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:41:59.3927118Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:41:59.3929573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:41:59.4310841Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:41:59.4314946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 0 2022-08-17T13:41:59.4316050Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:41:59.4337218Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:00.8919431Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:42:00.8922003Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:42:00.9079160Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:00.9079752Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:00.9080439Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:00.9080986Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:01.3146160Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:42:01.3146661Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:42:01.3570308Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:42:01.3571041Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:42:01.8405039Z ok (4.213s) 2022-08-17T13:42:01.8417693Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_True_static_graph_True_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:01.8432865Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79033 2022-08-17T13:42:01.8440468Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79034 2022-08-17T13:42:03.5803428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:03.5804182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:03.5804788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:03.5805272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:03.5933241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:03.5933733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:03.5934322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:03.5934797Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:03.7497017Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:03.7499148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:42:03.7670509Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:42:03.7674399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-08-17T13:42:03.7675208Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:03.7704707Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:05.2628470Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:42:05.2630718Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:42:05.2809751Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:05.2810366Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:05.2811072Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:05.2811619Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:05.7003826Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:42:05.7004354Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:42:05.7486636Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:42:05.7487192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-08-17T13:42:06.2554556Z ok (4.415s) 2022-08-17T13:42:06.2566420Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_True_static_graph_True_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:06.2582618Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79146 2022-08-17T13:42:06.2590710Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79147 2022-08-17T13:42:07.9147152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:07.9148069Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:07.9149170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:07.9150031Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:07.9399107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:07.9399610Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:07.9401593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:07.9402057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:08.1048431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:08.1051089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:42:08.1138887Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:42:08.1142719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 1 2022-08-17T13:42:08.1144237Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:08.1154748Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:09.5972207Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:42:09.5975268Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-08-17T13:42:09.6152174Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:09.6152794Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:09.6153492Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:09.6154055Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:10.0300821Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:42:10.0301347Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:42:10.0881468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:42:10.0882009Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:42:10.5699315Z ok (4.314s) 2022-08-17T13:42:10.5731200Z test_local_optimizer_parity_optimizer_class_str_AdamW_maximize_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:10.5746278Z When combined with DDP, check that a local optimizer gives the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79259 2022-08-17T13:42:10.5754022Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79260 2022-08-17T13:42:12.2460370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:12.2460904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:12.2461693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:12.2462386Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:12.2650182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:12.2650990Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:12.2652635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:12.2653165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:12.4363465Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:42:12.4365531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:42:12.4408713Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:12.4411944Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:42:12.4412734Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:12.4469147Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:42:13.7932330Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx4jp1grm 2022-08-17T13:42:13.7933000Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx4jp1grm/_remote_module_non_scriptable.py 2022-08-17T13:42:13.7960183Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxzm9l3m6 2022-08-17T13:42:13.7963201Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxzm9l3m6/_remote_module_non_scriptable.py 2022-08-17T13:42:13.9117762Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:13.9118363Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:13.9122051Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:13.9122648Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:14.3185379Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:42:14.3231029Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. 
If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:42:14.5761151Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.5771578Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.5947908Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.5958845Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.6135731Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.6148111Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.6323796Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.6335339Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.6512269Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer 
detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.6524324Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.6701073Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.6712921Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.6890279Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.6901152Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.7182551Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:14.7201852Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:15.1874637Z ok (4.617s) 2022-08-17T13:42:15.1907353Z test_local_optimizer_parity_optimizer_class_str_AdamW_maximize_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:15.1921442Z When combined with DDP, check that a local optimizer gives the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79342 2022-08-17T13:42:15.1929179Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79343 2022-08-17T13:42:16.8520610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:16.8521169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:16.8522237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:16.8522741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:16.8803539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:16.8804024Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:16.8806343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:16.8806823Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:17.0178105Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:17.0180186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:42:17.0526934Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:42:17.0529957Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:42:17.0530726Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:17.0589172Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:42:18.3910685Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpviefbvou 2022-08-17T13:42:18.3911774Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpviefbvou/_remote_module_non_scriptable.py 2022-08-17T13:42:18.4243745Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj7trccv_ 2022-08-17T13:42:18.4245375Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj7trccv_/_remote_module_non_scriptable.py 2022-08-17T13:42:18.5341434Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:18.5342019Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:18.5345105Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:18.5345665Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:18.9615618Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:42:18.9670301Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. 
If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:42:19.2148475Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.2156768Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.2335620Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.2344586Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.2523760Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.2532530Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.2711218Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.2720237Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.2898834Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer 
detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.2908127Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.3086681Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.3099253Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.3278438Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.3288081Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.3567495Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.3592334Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:19.9047552Z ok (4.717s) 2022-08-17T13:42:19.9079747Z test_local_optimizer_parity_optimizer_class_str_Adam_maximize_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:19.9094177Z When combined with DDP, check that a local optimizer gives the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79425 2022-08-17T13:42:19.9101991Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79426 2022-08-17T13:42:21.5810330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:21.5810843Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:21.5811989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:21.5812480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:21.6366137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:21.6366904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:21.6368737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:21.6369328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:21.7699970Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:42:21.7701485Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:42:21.8104810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:21.8108868Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:42:21.8110253Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:21.8111193Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:42:23.1531192Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmlujgbc5 2022-08-17T13:42:23.1532092Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmlujgbc5/_remote_module_non_scriptable.py 2022-08-17T13:42:23.1962028Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu6buydbr 2022-08-17T13:42:23.1963453Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu6buydbr/_remote_module_non_scriptable.py 2022-08-17T13:42:23.3149409Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:23.3150026Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:23.3153654Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:23.3154366Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:23.7196296Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:42:23.7318783Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. 
If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:42:23.9794802Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:23.9805076Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:23.9981095Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:23.9992840Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.0169377Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.0181167Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.0358451Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.0370492Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.0547319Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer 
detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.0559469Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.0737109Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.0749552Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.0931760Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.0941934Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.1226173Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.1242014Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:24.6221258Z ok (4.717s) 2022-08-17T13:42:24.6251849Z test_local_optimizer_parity_optimizer_class_str_Adam_maximize_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:24.6266772Z When combined with DDP, check that a local optimizer gives the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79508 2022-08-17T13:42:24.6274279Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79509 2022-08-17T13:42:26.3206294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:26.3207343Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:26.3208508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:26.3209453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:26.4269436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:26.4270379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:26.4271462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:26.4271961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:26.4855228Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:42:26.4859048Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:42:26.5971143Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:26.5974750Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:42:26.5976228Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:26.5979562Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:42:27.9183109Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf71v7en3 2022-08-17T13:42:27.9184477Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf71v7en3/_remote_module_non_scriptable.py 2022-08-17T13:42:27.9528244Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg7n9b32x 2022-08-17T13:42:27.9530519Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg7n9b32x/_remote_module_non_scriptable.py 2022-08-17T13:42:28.0693782Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:28.0694871Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:28.0696467Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:28.0697429Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:28.4963241Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:42:28.5001738Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. 
If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:42:28.7477815Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.7486962Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.7665047Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.7674435Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.7853347Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.7862251Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.8040706Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.8050511Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.8229556Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer 
detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.8239227Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.8419841Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.8426853Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.8606149Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.8614248Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.8891147Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:28.8911260Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:29.3389890Z ok (4.717s) 2022-08-17T13:42:29.3420505Z test_local_optimizer_parity_optimizer_class_str_SGD_maximize_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:29.3435241Z When combined with DDP, check that a local optimizer gives the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79591 2022-08-17T13:42:29.3442910Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79592 2022-08-17T13:42:31.0006139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:31.0006640Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:31.0007582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:31.0008068Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:31.0222049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:31.0222514Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:31.0225255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:31.0225742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:31.1670323Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:42:31.1672986Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:42:31.1947187Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:31.1950825Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:42:31.1951940Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:31.1979497Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:42:32.5031669Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz35pp9lp 2022-08-17T13:42:32.5032298Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz35pp9lp/_remote_module_non_scriptable.py 2022-08-17T13:42:32.5423611Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpksdgp7u4 2022-08-17T13:42:32.5426175Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpksdgp7u4/_remote_module_non_scriptable.py 2022-08-17T13:42:32.6641970Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:32.6642925Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:32.6648234Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:32.6648815Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:33.0809014Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:42:33.0883424Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. 
If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:42:33.3234948Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.3247178Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.3414044Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.3427815Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.3594723Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.3608221Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.3775136Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.3787674Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.3954733Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer 
detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.3967551Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.4134102Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.4146874Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.4314714Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.4326283Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.4523945Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.4525522Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:33.9557287Z ok (4.617s) 2022-08-17T13:42:33.9589993Z test_local_optimizer_parity_optimizer_class_str_SGD_maximize_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:33.9606669Z When combined with DDP, check that a local optimizer gives the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79674 2022-08-17T13:42:33.9615082Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79675 2022-08-17T13:42:35.6138620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:35.6139277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:35.6140167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:35.6140649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:35.6397860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:35.6398430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:35.6400472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:35.6400969Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:35.7769137Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:42:35.7771885Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:42:35.8095578Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:35.8099622Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:42:35.8100353Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:35.8180444Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:42:37.1364765Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgjcblm5w 2022-08-17T13:42:37.1365454Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgjcblm5w/_remote_module_non_scriptable.py 2022-08-17T13:42:37.1641434Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps0rlug34 2022-08-17T13:42:37.1644060Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps0rlug34/_remote_module_non_scriptable.py 2022-08-17T13:42:37.2769663Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:37.2770274Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:37.2781886Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:42:37.2782448Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:42:37.6767946Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:42:37.6888904Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. 
If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-08-17T13:42:37.9381401Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:37.9402493Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:37.9572198Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:37.9593057Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:37.9762368Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:37.9783411Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:37.9952774Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:37.9973882Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:38.0142645Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer 
detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:38.0164510Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:38.0332687Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:38.0354610Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:38.0523732Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:38.0545230Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:38.0754646Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:38.0755543Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-08-17T13:42:38.5724859Z ok (4.617s) 2022-08-17T13:42:38.5734292Z test_lr_scheduler (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:38.5750063Z Check that a normal PyTorch ``lr_scheduler`` is usable with ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79757 2022-08-17T13:42:38.5758479Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79758 2022-08-17T13:42:40.2304326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:40.2305280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:40.2306482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:40.2307459Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:40.2548804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:40.2549756Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:40.2551240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:40.2552193Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:40.3959857Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:40.3962323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:42:40.4265501Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:42:40.4270095Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:42:40.4271485Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:40.4272772Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:42:43.2880147Z ok (4.715s) 2022-08-17T13:42:43.2899990Z test_multiple_param_groups (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:43.2915574Z Check parity between constructing ZeRO with multiple parameter groups ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79841 2022-08-17T13:42:43.3073094Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79842 2022-08-17T13:42:44.9446954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:44.9447468Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:44.9448236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:44.9449011Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:44.9745475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:44.9745938Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:44.9748623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:44.9749108Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:45.1086672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:45.1089666Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:42:45.1460861Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:42:45.1464655Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:42:45.1465354Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:45.1497655Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:48.5201169Z ok (5.232s) 2022-08-17T13:42:48.5225684Z test_nondefault_process_group (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:48.5243789Z Check that ZeroRedundancyOptimizer works with a non-default process ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79925 2022-08-17T13:42:48.5251721Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79926 2022-08-17T13:42:50.1632765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:50.1633710Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:50.1634913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:50.1635850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:50.1853931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:50.1854842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:50.1856048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:50.1856561Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:50.3283012Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:50.3283791Z INFO:torch.testing._internal.common_distributed:Skipping `test_nondefault_process_group()` since world size of 2 is less than 4 2022-08-17T13:42:50.3575725Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 1 2022-08-17T13:42:50.3576285Z INFO:torch.testing._internal.common_distributed:Skipping `test_nondefault_process_group()` since world size of 2 is less than 4 2022-08-17T13:42:50.6316145Z ok (2.111s) 2022-08-17T13:42:50.6322422Z test_sharding (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:50.6325514Z Check ZeroRedundancyOptimizer's parameter sharding at construction ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/67295 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-08-17T13:42:50.6338737Z test_step (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:50.6355439Z Check that ZeroRedundancyOptimizer properly exposes the ``step()`` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79993 2022-08-17T13:42:50.6363743Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79994 2022-08-17T13:42:52.2663892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:52.2664891Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:52.2666075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:52.2666995Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:52.2766790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:52.2768152Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:52.2769356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T13:42:52.2770289Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:52.4327108Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:52.4329784Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:42:52.4482317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:42:52.4485819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:42:52.4486574Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:52.4535936Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:54.7468209Z ok (4.114s) 2022-08-17T13:42:54.7483844Z test_step_with_closure (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:54.7499199Z Check that ZeroRedundancyOptimizer properly exposes the ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80076 2022-08-17T13:42:54.7507819Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80077 2022-08-17T13:42:56.4369785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:56.4370327Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:56.4370926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:56.4371404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:56.4573092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:42:56.4573564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:42:56.4575975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:42:56.4576458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:42:56.6009615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:42:56.6012367Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:42:56.6277681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:42:56.6281426Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:42:56.6282179Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:42:56.6318740Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:42:58.9611633Z ok (4.214s) 2022-08-17T13:42:58.9614264Z test_zero_join_cpu (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:42:58.9630145Z Check that the ZeRO join hook allows training with uneven inputs ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80159 2022-08-17T13:42:58.9639486Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80160 2022-08-17T13:43:00.6338271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:00.6338772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:00.6339822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:00.6340665Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:00.6605606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:00.6606071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:00.6608451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:00.6608928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:00.8047053Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:00.8307284Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:43:00.8422898Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:00.8423413Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:43:00.8424475Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier 
for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:43:00.8425173Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:43:00.8948930Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcgoc2ak3 2022-08-17T13:43:00.8950128Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcgoc2ak3/_remote_module_non_scriptable.py 2022-08-17T13:43:00.8960747Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppqfsic_m 2022-08-17T13:43:00.8963596Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppqfsic_m/_remote_module_non_scriptable.py 2022-08-17T13:43:00.9194188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:43:00.9194697Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:43:00.9602153Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-08-17T13:43:00.9602632Z _warnings.warn(warn_message, ResourceWarning) 2022-08-17T13:43:00.9603216Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-08-17T13:43:00.9603650Z _warnings.warn(warn_message, ResourceWarning) 2022-08-17T13:43:01.2700452Z ok (2.309s) 2022-08-17T13:43:01.2702852Z test_zero_join_gpu (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:43:01.2718576Z Check that the ZeRO join hook allows training with uneven inputs ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80237 2022-08-17T13:43:01.2727534Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80238 2022-08-17T13:43:03.0036961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:03.0037453Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:03.0038534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:03.0039038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:03.0039628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:03.0040055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:03.0042362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:03.0042848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:03.1725437Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:03.1733091Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:03.1865697Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:43:03.1875231Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:43:03.1876012Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:43:03.1938325Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:43:04.5116751Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8s9v0or9 2022-08-17T13:43:04.5117707Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8s9v0or9/_remote_module_non_scriptable.py 2022-08-17T13:43:04.5330320Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcue_yb3o 2022-08-17T13:43:04.5332649Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcue_yb3o/_remote_module_non_scriptable.py 2022-08-17T13:43:05.5711066Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:43:05.5711672Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:43:05.5712378Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:43:05.5712922Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:43:05.9814693Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:43:05.9815223Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:43:06.0544820Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-08-17T13:43:06.0545373Z _warnings.warn(warn_message, ResourceWarning) 2022-08-17T13:43:06.0545981Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-08-17T13:43:06.0546419Z _warnings.warn(warn_message, ResourceWarning) 2022-08-17T13:43:06.5859601Z ok (5.316s) 2022-08-17T13:43:06.5865658Z test_zero_model_parallel_parameters_as_bucket_view_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:43:06.5881931Z Check that ZeRO works with model parallelism where the model's ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80321 2022-08-17T13:43:06.5890319Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80322 2022-08-17T13:43:08.2760807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:08.2761365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:08.2762233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:08.2762760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:08.2886204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:08.2886691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:08.2889717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:08.2890204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:08.4415138Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:43:08.4594752Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:08.7955682Z skip: Need at least 4 CUDA devices (2.209s) 2022-08-17T13:43:08.7960888Z test_zero_model_parallel_parameters_as_bucket_view_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-08-17T13:43:08.7976937Z Check that ZeRO works with model parallelism where the model's ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80389 2022-08-17T13:43:08.7985653Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80390 2022-08-17T13:43:10.4334766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:10.4335283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:10.4335890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:10.4336383Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:10.4347939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:10.4348396Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:10.4351287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:10.4351773Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:10.6022553Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:10.6084931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:43:10.9045445Z skip: Need at least 4 CUDA devices (2.109s) 2022-08-17T13:43:10.9060150Z test_constructor (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-08-17T13:43:10.9076207Z Check the robustness of the ZeroRedundancyOptimizer constructor by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80457 2022-08-17T13:43:12.5256607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:12.5257125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:12.5258002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:12.5258481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:12.7140631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:12.7143906Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:12.7145043Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:43:13.0126519Z ok (2.108s) 2022-08-17T13:43:13.0136178Z test_lr_scheduler (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-08-17T13:43:13.0152860Z Check that a normal PyTorch ``lr_scheduler`` is usable with ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80492 2022-08-17T13:43:14.6474463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:14.6474970Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:14.6475781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:14.6476263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:14.8350377Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:14.8353780Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:14.8354893Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:43:16.5236912Z ok (3.511s) 2022-08-17T13:43:16.5243514Z test_same_dense_param_type (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-08-17T13:43:16.5260195Z Check that ZeroRedundancyOptimizer raises an exception if the input ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80534 2022-08-17T13:43:18.1606941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:18.1607461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:18.1608494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:18.1608957Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:18.3506240Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:18.3509600Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:18.3510482Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:43:18.6311012Z ok (2.107s) 2022-08-17T13:43:18.6331950Z test_state_dict (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-08-17T13:43:18.6348034Z Check that ZeroRedundancyOptimizer exposes the expected state dict ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80569 2022-08-17T13:43:20.3042905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:20.3043401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:20.3044560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:20.3045037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:20.4948849Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:20.4951917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:20.4953046Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:43:22.1430981Z ok (3.512s) 2022-08-17T13:43:22.1439182Z test_step_with_extra_inner_key (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-08-17T13:43:22.1455796Z Check that ZeroRedundancyOptimizer wrapping an optimizer that adds ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80611 2022-08-17T13:43:23.7631855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:23.7632380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:23.7633687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:23.7634171Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:23.9505521Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:23.9509086Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:23.9510125Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:43:25.6541854Z ok (3.511s) 2022-08-17T13:43:25.6550048Z test_step_with_kwargs (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-08-17T13:43:25.6566031Z Check that the ``step(**kwargs)`` interface is properly exposed. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80653 2022-08-17T13:43:27.3026453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:27.3026983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:27.3028066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:27.3028538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:27.4907224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:27.4910352Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:27.4911353Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:43:29.1649163Z ok (3.511s) 2022-08-17T13:43:29.1655050Z test_step_without_closure (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-08-17T13:43:29.1671445Z Check that the ``step()`` method (without closure) is handled as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80695 2022-08-17T13:43:30.8147449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:30.8147959Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:30.8150553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:30.8151446Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:31.0045383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:31.0048389Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:31.0049163Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:43:32.6756601Z ok (3.511s) 2022-08-17T13:43:32.6763024Z test_zero_grad (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-08-17T13:43:32.6779556Z Check that the ``zero_grad`` method is properly handled. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80737 2022-08-17T13:43:34.3233976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:34.3234470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:34.3235249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:34.3235708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:34.5105813Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:34.5109385Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:34.5110301Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:43:34.7831253Z ok (2.107s) 2022-08-17T13:43:34.7831549Z 2022-08-17T13:43:34.7833567Z ---------------------------------------------------------------------- 2022-08-17T13:43:34.7833958Z Ran 42 tests in 159.571s 2022-08-17T13:43:34.7834135Z 2022-08-17T13:43:34.7834254Z OK (skipped=4) 2022-08-17T13:43:34.7834411Z 2022-08-17T13:43:34.7834524Z Generating XML reports... 2022-08-17T13:43:34.7918709Z Generated XML report: test-reports/python-unittest/distributed.optim.test_zero_redundancy_optimizer/TEST-TestZeroRedundancyOptimizerDistributed-20220817134055.xml 2022-08-17T13:43:34.7929876Z Generated XML report: test-reports/python-unittest/distributed.optim.test_zero_redundancy_optimizer/TEST-TestZeroRedundancyOptimizerSingleRank-20220817134055.xml 2022-08-17T13:43:35.1302697Z Running distributed/fsdp/test_fsdp_optim_state ... 
[2022-08-17 13:43:35.129807] 2022-08-17T13:43:35.1303748Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_optim_state.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:43:35.129877] 2022-08-17T13:43:36.7535742Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_optim_state 2022-08-17T13:43:36.7556742Z 2022-08-17T13:43:36.7557074Z Running tests... 2022-08-17T13:43:36.7557510Z ---------------------------------------------------------------------- 2022-08-17T13:43:36.7568699Z test_full_optim_state_dict_keys (__main__.TestFSDPOptimState) 2022-08-17T13:43:38.2914152Z Tests that the parameter keys returned by ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:43:38.3099264Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80809 2022-08-17T13:43:38.3105609Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80810 2022-08-17T13:43:39.7443248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:39.7443989Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:39.7446257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:39.7446728Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:39.7608111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:39.7608581Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:39.7612632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:39.7613116Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:39.9122444Z dist init r=1, world=2 
2022-08-17T13:43:39.9126119Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:43:39.9326002Z dist init r=0, world=2 2022-08-17T13:43:39.9330636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:39.9331352Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:43:39.9332049Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:43:41.3217366Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:43:41.3217987Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:41.7501723Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:43:41.7502343Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:43:41.7542806Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:43:41.7543543Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:43:42.2206161Z ok (5.465s) 2022-08-17T13:43:42.2213191Z test_full_optim_state_dict_nested_invalid (__main__.TestFSDPOptimState) 2022-08-17T13:43:42.2227031Z Tests that :meth:`full_optim_state_dict` raises an error when ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80892 2022-08-17T13:43:42.2233084Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80893 2022-08-17T13:43:43.7089955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:43.7090807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:43.7093477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:43.7093969Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:43.7200144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:43.7200612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:43.7204802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:43.7205292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:43.8777725Z dist init r=0, world=2 2022-08-17T13:43:43.8781910Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:43.8930065Z dist init r=1, world=2 2022-08-17T13:43:43.8934494Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:43:43.8935851Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:43:43.8986902Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:43:45.2652696Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:45.2653260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:43:45.7101812Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:43:45.7102434Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:43:45.7146261Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:43:45.7146815Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:43:46.2333575Z ok (4.013s) 2022-08-17T13:43:46.2346491Z test_full_optim_state_dict_nested_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-08-17T13:43:46.2360311Z Tests :meth:`full_optim_state_dict` by comparing the returned dict for ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80975 2022-08-17T13:43:46.2366061Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80976 2022-08-17T13:43:47.7467085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:47.7467618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:47.7470461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:47.7471170Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:47.7608880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:47.7609343Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:47.7611889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:47.7612364Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:47.9148211Z dist init r=0, world=2 2022-08-17T13:43:47.9152354Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:47.9322307Z dist init r=1, world=2 2022-08-17T13:43:47.9326886Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:43:47.9327853Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:43:47.9357213Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:43:49.2861612Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:43:49.2862140Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:49.7226222Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:43:49.7226829Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:43:49.7293477Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:43:49.7294059Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:43:50.2464875Z ok (4.013s) 2022-08-17T13:43:50.2477714Z test_full_optim_state_dict_nested_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-08-17T13:43:50.2491470Z Tests :meth:`full_optim_state_dict` by comparing the returned dict for ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81058 2022-08-17T13:43:50.2497632Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81059 2022-08-17T13:43:51.7151640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:51.7154169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:51.7154798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:51.7155277Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:51.7798552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:51.7799013Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:51.7802664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:51.7803135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:51.8911623Z dist init r=0, world=2 2022-08-17T13:43:51.8915347Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:51.9452645Z dist init r=1, world=2 2022-08-17T13:43:51.9456840Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:43:51.9457634Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:43:51.9527671Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:43:53.3369153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:43:53.3369682Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:53.7686105Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:43:53.7686723Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:43:53.7736832Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:43:53.7737735Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:43:54.3603700Z ok (4.114s) 2022-08-17T13:43:54.3616547Z test_full_optim_state_dict_nested_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-08-17T13:43:54.3630310Z Tests :meth:`full_optim_state_dict` by comparing the returned dict for ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81141 2022-08-17T13:43:54.3636607Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81142 2022-08-17T13:43:55.8794435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:55.8794916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:55.8797953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:55.8798453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:55.9010526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:55.9010967Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:55.9015332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:55.9015810Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:56.0680290Z dist init r=0, world=2 2022-08-17T13:43:56.0684562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:43:56.0730873Z dist init r=1, world=2 2022-08-17T13:43:56.0735157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:43:56.0735893Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:43:56.0788297Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:43:57.4565337Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:43:57.4565855Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:43:57.8839402Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:43:57.8839988Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:43:57.8920681Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:43:57.8921255Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:43:58.3736136Z ok (4.013s) 2022-08-17T13:43:58.3749666Z test_full_optim_state_dict_nested_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-08-17T13:43:58.3763004Z Tests :meth:`full_optim_state_dict` by comparing the returned dict for ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81224 2022-08-17T13:43:58.3769261Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81225 2022-08-17T13:43:59.8085512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:59.8086038Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:59.8087664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:59.8088424Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:59.8394126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:43:59.8394605Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:43:59.8398895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:43:59.8399357Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:43:59.9740067Z dist init r=1, world=2 2022-08-17T13:43:59.9743926Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:00.0103853Z dist init r=0, world=2 2022-08-17T13:44:00.0108673Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:00.0109403Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:00.0152543Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:01.3869643Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:01.3870320Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:01.8409841Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:01.8410428Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:01.8440294Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:01.8440869Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:02.3865889Z ok (4.013s) 2022-08-17T13:44:02.3878976Z test_full_optim_state_dict_nested_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-08-17T13:44:02.3892296Z Tests :meth:`full_optim_state_dict` by comparing the returned dict for ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81307 2022-08-17T13:44:02.3898085Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81308 2022-08-17T13:44:03.8849514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:03.8850025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:03.8853341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:03.8854175Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:03.9535030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:03.9535505Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:03.9539674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:03.9540165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:04.0506824Z dist init r=0, world=2 2022-08-17T13:44:04.0510859Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:04.1260366Z dist init r=1, world=2 2022-08-17T13:44:04.1265580Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:04.1266519Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:04.1325787Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:05.4999511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:05.5000023Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:05.9293960Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:05.9295080Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:05.9322780Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:05.9323806Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:06.4996689Z ok (4.113s) 2022-08-17T13:44:06.5009338Z test_full_optim_state_dict_nested_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-08-17T13:44:06.5022556Z Tests :meth:`full_optim_state_dict` by comparing the returned dict for ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81390 2022-08-17T13:44:06.5028825Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81391 2022-08-17T13:44:07.9299399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:07.9299915Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:07.9302347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:07.9302837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:07.9932991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:07.9933496Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:07.9937359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:07.9937847Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:08.0955460Z dist init r=1, world=2 2022-08-17T13:44:08.0959531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:08.2056066Z dist init r=0, world=2 2022-08-17T13:44:08.2059439Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:08.2060614Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:08.2079612Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:09.5804971Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:09.5805821Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:10.0095649Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:10.0096239Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:10.0123946Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:10.0124495Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:10.5126535Z ok (4.013s) 2022-08-17T13:44:10.5139053Z test_full_optim_state_dict_nested_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-08-17T13:44:10.5152431Z Tests :meth:`full_optim_state_dict` by comparing the returned dict for ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81473 2022-08-17T13:44:10.5158284Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81474 2022-08-17T13:44:11.9630761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:11.9631260Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:11.9633963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:11.9634454Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:11.9827404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:11.9827880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:11.9832061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:11.9832531Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:12.1294660Z dist init r=0, world=2 2022-08-17T13:44:12.1298877Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:12.1550816Z dist init r=1, world=2 2022-08-17T13:44:12.1555397Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:12.1556633Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:12.1605826Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:13.5235523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:13.5236068Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:13.9512557Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:13.9513159Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:13.9659235Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:13.9659787Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:14.5256215Z ok (4.013s) 2022-08-17T13:44:14.5269125Z test_full_optim_state_dict_nested_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-08-17T13:44:14.5281983Z Tests :meth:`full_optim_state_dict` by comparing the returned dict for ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81556 2022-08-17T13:44:14.5287992Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81557 2022-08-17T13:44:15.9704629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:15.9705140Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:15.9707502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:15.9707974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:15.9985034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:15.9985498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:15.9989751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:15.9990220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:16.1380840Z dist init r=1, world=2 2022-08-17T13:44:16.1384531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:16.1703359Z dist init r=0, world=2 2022-08-17T13:44:16.1707928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:16.1708706Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:16.1792738Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:17.5279400Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:17.5280053Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:17.9771483Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:17.9772084Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:17.9794722Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:17.9795282Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:18.5387255Z ok (4.013s) 2022-08-17T13:44:18.5397321Z test_rekey_optim_state_dict_to_ids_use_multiple_param_groups_False (__main__.TestFSDPOptimState) 2022-08-17T13:44:18.5410183Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81639 2022-08-17T13:44:18.5416267Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81640 2022-08-17T13:44:19.9991788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:19.9992292Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:19.9994076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:19.9994644Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:20.0119556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:20.0120068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:20.0123852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:20.0124345Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:20.1655484Z dist init r=0, world=2 2022-08-17T13:44:20.1658982Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:20.1838550Z dist init r=1, world=2 2022-08-17T13:44:20.1843298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:20.1844465Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:20.1863390Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:21.5636567Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:21.5637094Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:21.9947397Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:22.0014398Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:22.0015355Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:22.0015921Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:22.5520085Z ok (4.013s) 2022-08-17T13:44:22.5530322Z test_rekey_optim_state_dict_to_ids_use_multiple_param_groups_True (__main__.TestFSDPOptimState) 2022-08-17T13:44:22.5543226Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81722 2022-08-17T13:44:22.5549429Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81723 2022-08-17T13:44:24.0671917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:24.0672428Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:24.0674803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:24.0675500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:24.0678112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:24.0678781Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:24.0682822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:24.0683484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:24.2415027Z dist init r=0, world=2 2022-08-17T13:44:24.2418652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:24.2447925Z dist init r=1, world=2 2022-08-17T13:44:24.2452103Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:24.2453248Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:24.2522141Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:25.6277369Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:25.6277941Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:26.0621146Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:26.0621775Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:26.0666856Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:26.0667457Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:26.6649890Z ok (4.113s) 2022-08-17T13:44:26.6660294Z test_rekey_optim_state_dict_to_names_use_multiple_param_groups_False (__main__.TestFSDPOptimState) 2022-08-17T13:44:26.6674795Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81805 2022-08-17T13:44:26.6681291Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81806 2022-08-17T13:44:28.1103019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:28.1104048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:28.1105504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:28.1105967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:28.1632244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:28.1632715Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:28.1636374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:28.1636842Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:28.2758775Z dist init r=1, world=2 2022-08-17T13:44:28.2762233Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:28.3295812Z dist init r=0, world=2 2022-08-17T13:44:28.3299976Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:28.3300748Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:28.3374043Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:29.7262429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:29.7262973Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:30.1635736Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:30.1636346Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:30.1651688Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:30.1652237Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:30.7784332Z ok (4.113s) 2022-08-17T13:44:30.7789006Z test_scatter_full_optim_state_dict_nested_halve_world_size (__main__.TestFSDPOptimState) 2022-08-17T13:44:30.7803700Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81888 2022-08-17T13:44:30.7809816Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81889 2022-08-17T13:44:32.2764474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:32.2764968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:32.2767817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:32.2768303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:32.3125964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:32.3126451Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:32.3130731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:32.3131210Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:32.4443479Z dist init r=1, world=2 2022-08-17T13:44:32.4447390Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:32.4860657Z dist init r=0, world=2 2022-08-17T13:44:32.4865585Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:32.4866888Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:32.4957760Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:33.8739993Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:33.8740626Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:34.3195696Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:34.3196308Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:34.3368107Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:34.3368697Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:34.3851750Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:44:34.3856318Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:44:34.3857072Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:44:34.3954540Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:44:34.9913003Z ok (4.213s) 2022-08-17T13:44:34.9918742Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-08-17T13:44:34.9932448Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81976 2022-08-17T13:44:34.9938471Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81977 2022-08-17T13:44:36.4442909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:36.4443867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:36.4446113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:36.4447019Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:36.4929389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:36.4930326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:36.4934419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:36.4935415Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:36.6112282Z dist init r=1, world=2 2022-08-17T13:44:36.6116127Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:36.6650215Z dist init r=0, world=2 2022-08-17T13:44:36.6655133Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:36.6656259Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:36.6727853Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:38.0487056Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:38.0488401Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:38.4962425Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:38.4963026Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:38.5051139Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:38.5051700Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:39.1040844Z ok (4.113s) 2022-08-17T13:44:39.1046523Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-08-17T13:44:39.1060054Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82059 2022-08-17T13:44:39.1066641Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82060 2022-08-17T13:44:40.5806728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:40.5807246Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:40.5809771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:40.5810260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:40.5930183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:40.5930644Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:40.5935248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:40.5935740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:40.7499273Z dist init r=1, world=2 2022-08-17T13:44:40.7503234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:40.7660631Z dist init r=0, world=2 2022-08-17T13:44:40.7665557Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:40.7666949Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:40.7707561Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:42.1541813Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:42.1542369Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:42.5916095Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:42.5917540Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:42.5960701Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:42.5961807Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:43.2169069Z ok (4.113s) 2022-08-17T13:44:43.2174835Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-08-17T13:44:43.2189027Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82142 2022-08-17T13:44:43.2194576Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82143 2022-08-17T13:44:44.7130209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:44.7131220Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:44.7132677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:44.7133595Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:44.7219651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:44.7220511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:44.7224127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:44.7225100Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:44.8837204Z dist init r=0, world=2 2022-08-17T13:44:44.8841140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:44.8985480Z dist init r=1, world=2 2022-08-17T13:44:44.8990773Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:44.8991942Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:44.9046310Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:46.2705229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:46.2706153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:46.7045904Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:46.7047000Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:46.7065584Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:46.7066872Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:47.3295376Z ok (4.112s) 2022-08-17T13:44:47.3300838Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-08-17T13:44:47.3314981Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82225 2022-08-17T13:44:47.3320508Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82226 2022-08-17T13:44:48.7648577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:48.7649278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:48.7651602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:48.7652104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:48.7929680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:48.7930148Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:48.7934547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:48.7935015Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:48.9314344Z dist init r=1, world=2 2022-08-17T13:44:48.9318060Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:48.9652678Z dist init r=0, world=2 2022-08-17T13:44:48.9657392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:48.9658117Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:48.9726801Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:50.3453268Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:50.3453791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:50.7675846Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:50.7676479Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:50.7822217Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:50.7822784Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:51.3419401Z ok (4.012s) 2022-08-17T13:44:51.3425083Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-08-17T13:44:51.3438572Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82308 2022-08-17T13:44:51.3444366Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82309 2022-08-17T13:44:52.8250326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:52.8250854Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:52.8253192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:52.8253659Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:52.8609753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:52.8610223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:52.8614541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:52.8615021Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:52.9916282Z dist init r=0, world=2 2022-08-17T13:44:52.9920191Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:53.0332224Z dist init r=1, world=2 2022-08-17T13:44:53.0336977Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:53.0337867Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:53.0430611Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:54.4014257Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:54.4014890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:54.8473985Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:54.8474586Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:54.8531903Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:54.8532474Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:55.4547392Z ok (4.113s) 2022-08-17T13:44:55.4553085Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-08-17T13:44:55.4566755Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82391 2022-08-17T13:44:55.4572605Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82392 2022-08-17T13:44:56.8885721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:56.8886219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:56.8888710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:56.8889207Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:56.9458423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:44:56.9458883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:44:56.9463086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:44:56.9463558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:44:57.0560033Z dist init r=1, world=2 2022-08-17T13:44:57.0563752Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:44:57.1148582Z dist init r=0, world=2 2022-08-17T13:44:57.1153038Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:44:57.1153774Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:44:57.1175293Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:44:58.4672726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:44:58.4673303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:44:58.9080021Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:58.9080615Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:58.9169817Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:44:58.9170641Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:44:59.5684486Z ok (4.114s) 2022-08-17T13:44:59.5690191Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-08-17T13:44:59.5703258Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82474 2022-08-17T13:44:59.5709642Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82475 2022-08-17T13:45:01.0281222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:01.0281970Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:01.0283779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:01.0284308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:01.0529578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:01.0530047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:01.0534349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:01.0534834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:01.1949071Z dist init r=0, world=2 2022-08-17T13:45:01.1952552Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:01.2273357Z dist init r=1, world=2 2022-08-17T13:45:01.2278189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:01.2278957Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:01.2361041Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:02.5996047Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:02.5996568Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:03.0123648Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:03.0124247Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:03.0362438Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:03.0363025Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:03.6812705Z ok (4.113s) 2022-08-17T13:45:03.6818133Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-08-17T13:45:03.6832202Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82557 2022-08-17T13:45:03.6838253Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82558 2022-08-17T13:45:05.1526084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:05.1526601Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:05.1529419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:05.1529938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:05.1706354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:05.1706849Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:05.1711409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:05.1711903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:05.3293302Z dist init r=0, world=2 2022-08-17T13:45:05.3296867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:05.3434136Z dist init r=1, world=2 2022-08-17T13:45:05.3438859Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:05.3439825Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:05.3502411Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:06.7330546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:06.7331067Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:07.1627226Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:07.1627813Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:07.1676421Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:07.1677013Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:07.7938723Z ok (4.113s) 2022-08-17T13:45:07.7942253Z test_scatter_full_optim_state_dict_transformer (__main__.TestFSDPOptimState) 2022-08-17T13:45:07.7956115Z Tests :meth:`scatter_full_optim_state_dict` for an FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82640 2022-08-17T13:45:07.7961824Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82641 2022-08-17T13:45:09.3014391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:09.3014898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:09.3017002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:09.3017472Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:09.3022764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:09.3023237Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:09.3027957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:09.3028418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:09.4736446Z dist init r=1, world=2 2022-08-17T13:45:09.4739974Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:09.4815740Z dist init r=0, world=2 2022-08-17T13:45:09.4820378Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:09.4821157Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:09.4843262Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:10.8838271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:10.8838827Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:10.9268573Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:10.9269147Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:10.9305284Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:10.9305831Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:11.5871579Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:45:11.5874283Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:45:11.5875146Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:45:11.5972808Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:45:12.3075176Z ok (4.513s) 2022-08-17T13:45:12.3079566Z test_shard_full_optim_state_dict_nested_halve_world_size (__main__.TestFSDPOptimState) 2022-08-17T13:45:12.3093915Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82728 2022-08-17T13:45:12.3099753Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82729 2022-08-17T13:45:13.7519517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:13.7520006Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:13.7522616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:13.7523275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:13.7984947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:13.7985413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:13.7990383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:13.7990872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:13.9214828Z dist init r=0, world=2 2022-08-17T13:45:13.9217907Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:14.0097278Z dist init r=1, world=2 2022-08-17T13:45:14.0101500Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:14.0102328Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:14.0134863Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:15.3731944Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:15.3732896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:15.7986422Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:15.7987566Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:15.8024818Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:15.8026136Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:15.8487496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:45:15.8492408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:45:15.8493131Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:45:15.8591455Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:45:16.4201737Z ok (4.113s) 2022-08-17T13:45:16.4208222Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-08-17T13:45:16.4222002Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82816 2022-08-17T13:45:16.4228180Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82817 2022-08-17T13:45:17.8568562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:17.8569079Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:17.8571799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:17.8572270Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:17.8572859Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:17.8573328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:17.8577636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:17.8578296Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:18.0301708Z dist init r=1, world=2 2022-08-17T13:45:18.0302001Z dist init r=0, world=2 2022-08-17T13:45:18.0305878Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:18.0306687Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:18.0307429Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:18.0308161Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:19.4041037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:19.4041571Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:19.8340435Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:19.8341448Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:19.8342775Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:19.8344288Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:20.4326986Z ok (4.012s) 2022-08-17T13:45:20.4332797Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-08-17T13:45:20.4347454Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82899 2022-08-17T13:45:20.4353339Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82900 2022-08-17T13:45:21.8683116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:21.8683623Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:21.8686149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:21.8686643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:21.8936044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:21.8936786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:21.8940965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:21.8941462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:22.0348293Z dist init r=0, world=2 2022-08-17T13:45:22.0352025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:22.0653185Z dist init r=1, world=2 2022-08-17T13:45:22.0658133Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:22.0658883Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:22.0659583Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:23.4596220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:23.4596987Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:23.9091015Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:23.9091607Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:23.9095854Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:23.9096422Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:24.5457426Z ok (4.113s) 2022-08-17T13:45:24.5462786Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-08-17T13:45:24.5476821Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82982 2022-08-17T13:45:24.5482341Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82983 2022-08-17T13:45:26.0100478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:26.0100979Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:26.0103582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:26.0104068Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:26.0198257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:26.0198722Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:26.0203061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:26.0203529Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:26.1775374Z dist init r=0, world=2 2022-08-17T13:45:26.1778734Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:26.1927162Z dist init r=1, world=2 2022-08-17T13:45:26.1931772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:26.1932772Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:26.1983978Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:27.5639861Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:27.5640687Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:28.0082523Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:28.0083140Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:28.0181605Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:28.0182176Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:28.6592219Z ok (4.113s) 2022-08-17T13:45:28.6597761Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-08-17T13:45:28.6611513Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83065 2022-08-17T13:45:28.6617407Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83066 2022-08-17T13:45:30.1176152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:30.1176657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:30.1179071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:30.1179561Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:30.1372662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:30.1373124Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:30.1377415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:30.1377905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:30.2863711Z dist init r=0, world=2 2022-08-17T13:45:30.2867191Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:30.3098608Z dist init r=1, world=2 2022-08-17T13:45:30.3103270Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:30.3104017Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:30.3174255Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:31.6823651Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:31.6824927Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:32.1132297Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:32.1133199Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:32.1172916Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:32.1173486Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:32.6718290Z ok (4.012s) 2022-08-17T13:45:32.6723785Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-08-17T13:45:32.6737232Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83148 2022-08-17T13:45:32.6742911Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83149 2022-08-17T13:45:34.1269680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:34.1270179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:34.1273029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:34.1273497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:34.1416584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:34.1417041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:34.1421100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:34.1421572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:34.2945369Z dist init r=1, world=2 2022-08-17T13:45:34.2949225Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:34.3135062Z dist init r=0, world=2 2022-08-17T13:45:34.3139578Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:34.3140336Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:34.3154247Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:35.7059139Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:35.7059831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:36.1349528Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:36.1350149Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:36.1430179Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:36.1430760Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:36.7840899Z ok (4.112s) 2022-08-17T13:45:36.7846245Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-08-17T13:45:36.7859607Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83231 2022-08-17T13:45:36.7865607Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83232 2022-08-17T13:45:38.2264513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:38.2265218Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:38.2267670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:38.2268155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:38.2326846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:38.2327289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:38.2331350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:38.2331825Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:38.3943711Z dist init r=1, world=2 2022-08-17T13:45:38.3947674Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:38.4014380Z dist init r=0, world=2 2022-08-17T13:45:38.4018776Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:38.4019573Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:38.4050722Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:39.7834034Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:39.7834567Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:40.2132549Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:40.2133159Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:40.2197272Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:40.2197833Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:40.7960224Z ok (4.012s) 2022-08-17T13:45:40.7965853Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-08-17T13:45:40.7979414Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83314 2022-08-17T13:45:40.7985067Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83315 2022-08-17T13:45:42.2547163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:42.2547676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:42.2548641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:42.2549150Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:42.2633026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:42.2633547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:42.2636842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:42.2637480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:42.4199066Z dist init r=1, world=2 2022-08-17T13:45:42.4203148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:42.4313743Z dist init r=0, world=2 2022-08-17T13:45:42.4318312Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:42.4319389Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:42.4408232Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:43.7952359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:43.7952962Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:44.2377186Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:44.2378146Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:44.2421856Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:44.2422423Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:44.8081195Z ok (4.012s) 2022-08-17T13:45:44.8086533Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-08-17T13:45:44.8099721Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83397 2022-08-17T13:45:44.8105241Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83398 2022-08-17T13:45:46.2934201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:46.2934729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:46.2936320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:46.2936804Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:46.3139740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:46.3140204Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:46.3144411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:46.3144896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:46.4590797Z dist init r=0, world=2 2022-08-17T13:45:46.4594604Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:46.4872596Z dist init r=1, world=2 2022-08-17T13:45:46.4877250Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:46.4878011Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:46.4900950Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:47.8495999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:47.8496498Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:48.2789529Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:48.2790124Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:48.2868772Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:48.2869598Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:48.9205156Z ok (4.112s) 2022-08-17T13:45:48.9210446Z test_shard_full_optim_state_dict_transformer (__main__.TestFSDPOptimState) 2022-08-17T13:45:48.9225625Z Tests :meth:`shard_full_optim_state_dict` for an FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83480 2022-08-17T13:45:48.9231550Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83481 2022-08-17T13:45:50.4103121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:50.4103901Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:50.4106888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:50.4107384Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:50.4370813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:50.4371281Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:50.4375610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:50.4376095Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:50.5838234Z dist init r=1, world=2 2022-08-17T13:45:50.5841705Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:50.6217584Z dist init r=0, world=2 2022-08-17T13:45:50.6222215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:50.6223021Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:50.6249076Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:51.9967787Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:51.9968336Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:52.0420688Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:52.0421261Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:52.0428255Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:52.0428852Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:52.6976254Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:45:52.6978400Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:45:52.6979470Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:45:52.7077780Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:45:53.4340747Z ok (4.513s) 2022-08-17T13:45:53.4352987Z test_shard_full_optim_state_dict_unmanaged_params_add_to_fsdp_module_False (__main__.TestFSDPOptimState) 2022-08-17T13:45:53.4365699Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83568 2022-08-17T13:45:53.4371909Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83569 2022-08-17T13:45:54.8997849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:54.8998389Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:54.9005139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:54.9005637Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:54.9927665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:54.9928146Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:54.9931894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:54.9932575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:55.0663844Z dist init r=1, world=2 2022-08-17T13:45:55.0668268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:55.1676387Z dist init r=0, world=2 2022-08-17T13:45:55.1680982Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:55.1681687Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:55.1685878Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:45:56.5495099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:45:56.5495632Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:45:56.9811932Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:56.9812551Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:56.9876926Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:45:56.9877624Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:45:57.5472918Z ok (4.113s) 2022-08-17T13:45:57.5484685Z test_shard_full_optim_state_dict_unmanaged_params_add_to_fsdp_module_True (__main__.TestFSDPOptimState) 2022-08-17T13:45:57.5497786Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83651 2022-08-17T13:45:57.5503567Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83652 2022-08-17T13:45:59.0105036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:59.0105553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:59.0108506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:59.0109005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:59.0945378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:45:59.0945848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:45:59.0950038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:45:59.0950526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:45:59.1769450Z dist init r=0, world=2 2022-08-17T13:45:59.1773256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:45:59.2670343Z dist init r=1, world=2 2022-08-17T13:45:59.2674538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:45:59.2675794Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:45:59.2689297Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:46:00.6290513Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:00.6291042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:01.0618558Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:46:01.0619527Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:46:01.0633626Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:46:01.0634194Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:46:01.5605463Z ok (4.013s) 2022-08-17T13:46:01.5605675Z 2022-08-17T13:46:01.5606259Z ---------------------------------------------------------------------- 2022-08-17T13:46:01.5606633Z Ran 35 tests in 144.805s 2022-08-17T13:46:01.5609025Z 2022-08-17T13:46:01.5609403Z OK 2022-08-17T13:46:01.5609602Z 2022-08-17T13:46:01.5609747Z Generating XML reports... 2022-08-17T13:46:01.5696982Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_optim_state/TEST-TestFSDPOptimState-20220817134336.xml 2022-08-17T13:46:01.9215153Z Running distributed/_shard/sharded_tensor/test_sharded_tensor ... [2022-08-17 13:46:01.920978] 2022-08-17T13:46:01.9216022Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_sharded_tensor.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:46:01.921050] 2022-08-17T13:46:03.5815545Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor 2022-08-17T13:46:03.5851739Z 2022-08-17T13:46:03.5852061Z Running tests... 
2022-08-17T13:46:03.5852534Z ---------------------------------------------------------------------- 2022-08-17T13:46:05.1017273Z test_empty (__main__.TestCreateTensorFromParams) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:46:05.1191258Z ok (1.534s) 2022-08-17T13:46:05.1226643Z test_local_tensor (__main__.TestLocalTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83769 2022-08-17T13:46:05.1233752Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83770 2022-08-17T13:46:05.1240498Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83771 2022-08-17T13:46:05.1246213Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83772 2022-08-17T13:46:06.5720101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:06.5720768Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:06.5721731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:06.5722197Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:06.5751560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:06.5752023Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:06.5755328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:06.5756027Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:06.6209552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:06.6210033Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:06.6211579Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:06.6212050Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:06.6454556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:06.6455042Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:06.6456382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:06.6456856Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:06.7417404Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:06.7432732Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:06.7939526Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:06.8185031Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:07.2309534Z skip: Need at least 4 CUDA devices (2.111s) 2022-08-17T13:46:07.2330379Z test_local_tensor_error (__main__.TestLocalTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83905 2022-08-17T13:46:07.2336752Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83906 2022-08-17T13:46:07.2342995Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83907 2022-08-17T13:46:07.2349748Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83908 2022-08-17T13:46:08.6783492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:08.6784238Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:08.6785130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:08.6785590Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:08.6804903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:08.6805362Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:08.6808434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:08.6808906Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:08.7209724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:08.7210183Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:08.7213006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:08.7213462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:08.7706502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:08.7706975Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:08.7709308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:08.7709780Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:08.8501308Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:08.8505252Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:08.8905384Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:08.9455314Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:09.3415136Z skip: Need at least 4 CUDA devices (2.111s) 2022-08-17T13:46:09.3435563Z test_collect_local_shard (__main__.TestModuleHookApi) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84041 2022-08-17T13:46:09.3441729Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84042 2022-08-17T13:46:09.3447627Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84043 2022-08-17T13:46:09.3454065Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84044 2022-08-17T13:46:10.7695010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:10.7695744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:10.7696829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:10.7697443Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:10.7817500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: 
UserWarning: loaded 39 slow tests 2022-08-17T13:46:10.7818218Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:10.7820996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:10.7821734Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:10.8409810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:10.8410364Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:10.8412503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:10.8413200Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:10.8426551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:10.8427252Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:10.8430241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:10.8430952Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:10.9357219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:10.9520555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:11.0152991Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:11.0174727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:11.3512388Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:46:11.3535921Z test_reshard_output (__main__.TestModuleHookApi) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84177 2022-08-17T13:46:11.3541880Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84178 2022-08-17T13:46:11.3548289Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84179 2022-08-17T13:46:11.3554212Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84180 2022-08-17T13:46:12.8117757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:12.8118599Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:12.8119229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:12.8119711Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:12.8403750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:12.8404221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:12.8407080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:12.8407550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:12.8425268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:12.8425734Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:12.8428944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:12.8429488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:12.8557109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:12.8557564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:12.8560549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:12.8561006Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:12.9819721Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:13.0104399Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:13.0129068Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:13.0285935Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:13.3614177Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:46:13.3633638Z test_create_shard_with_no_placement (__main__.TestShardMetadata) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84313 2022-08-17T13:46:13.3639759Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84314 2022-08-17T13:46:13.3646460Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84315 2022-08-17T13:46:13.3652949Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84316 2022-08-17T13:46:14.8043275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:14.8043806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:14.8044412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:14.8044890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:14.8089615Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:14.8090083Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:14.8093148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:14.8093630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:14.8116781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:14.8117244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:14.8120502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:14.8121004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:14.8302689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:14.8303154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:14.8306353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:14.8306832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:14.9699236Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:14.9776515Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:14.9822463Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:15.0045650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:15.3711950Z skip: Need at least 4 CUDA devices (2.010s) 
2022-08-17T13:46:15.3733181Z test_shard_metadata_init (__main__.TestShardMetadata) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84449 2022-08-17T13:46:15.3739669Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84450 2022-08-17T13:46:15.3746285Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84451 2022-08-17T13:46:15.3753055Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84452 2022-08-17T13:46:16.8207918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:16.8208886Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:16.8210060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:16.8211005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:16.8216963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:16.8217896Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:16.8219045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:16.8219917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:16.8221096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:16.8222049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:16.8223235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:16.8224481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:16.8237630Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:16.8238534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:16.8241754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:16.8242729Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:16.9991821Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:17.0063022Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:17.0064446Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:17.0065594Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:17.3811101Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:46:17.3835239Z test_shard_parameter (__main__.TestShardParameter) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84585 2022-08-17T13:46:17.3841133Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84586 2022-08-17T13:46:17.3847438Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84587 2022-08-17T13:46:17.3853285Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84588 2022-08-17T13:46:18.8126738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:18.8128022Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:18.8129221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:18.8130148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:18.8290498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:18.8291425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:18.8293218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:18.8294163Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:18.8576535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:18.8577462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:18.8579023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:18.8579991Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:18.8931341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:18.8932291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:18.8933899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:18.8934844Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:18.9824702Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:18.9950106Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:19.0243866Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:19.0663690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:19.3916112Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:46:19.3943802Z test_shard_parameter_errors (__main__.TestShardParameter) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84721 2022-08-17T13:46:19.3950347Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84722 2022-08-17T13:46:19.3956251Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84723 2022-08-17T13:46:19.3962389Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84724 2022-08-17T13:46:20.8722447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:20.8723030Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:20.8724018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:20.8724778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:20.9132211Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:20.9132694Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:20.9135311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:20.9135797Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:20.9163991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:20.9164463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:20.9167371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:20.9167863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:20.9234265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:20.9234726Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:20.9237780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:20.9238260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:21.0397854Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:21.0814339Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:21.0874947Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:21.0965037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:21.5024141Z skip: Need at least 4 CUDA devices (2.111s) 
2022-08-17T13:46:21.5047257Z test_shard_tensor (__main__.TestShardTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84857 2022-08-17T13:46:21.5053637Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84858 2022-08-17T13:46:21.5060319Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84859 2022-08-17T13:46:21.5067352Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84860 2022-08-17T13:46:22.9977627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:22.9978128Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:22.9979590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:22.9980084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:23.0324186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:23.0324650Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:23.0327229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:23.0327686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:23.0411297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:23.0411748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:23.0414934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:23.0415390Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:23.0482877Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:23.0483394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:23.0486345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:23.0486826Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:23.1643405Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:23.2063699Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:23.2068498Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:23.2163654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:23.6129189Z skip: Need at least 4 CUDA devices (2.110s) 2022-08-17T13:46:23.6156308Z test_shard_tensor_errors (__main__.TestShardTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84993 2022-08-17T13:46:23.6162663Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84994 2022-08-17T13:46:23.6169437Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84995 2022-08-17T13:46:23.6176391Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84996 2022-08-17T13:46:25.0621285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:25.0621816Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:25.0622897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:25.0623825Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:25.0651306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:25.0651776Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:25.0652337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:25.0652784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:25.0656802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:25.0657303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:25.0657919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:25.0658404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:25.0833001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:25.0833474Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:25.0836216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:25.0836694Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:25.2315437Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:25.2398845Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:25.2403453Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:25.2565820Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:25.6236001Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T13:46:25.6256983Z test_cleanup (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85129 2022-08-17T13:46:25.6263416Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85130 2022-08-17T13:46:25.6270742Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 85131 2022-08-17T13:46:25.6277780Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 85132 2022-08-17T13:46:27.0734056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:27.0734543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:27.0735652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:27.0736404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:27.0836806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: 
UserWarning: loaded 39 slow tests 2022-08-17T13:46:27.0837255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:27.0840688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:27.0841187Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:27.0841773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:27.0842200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:27.0844544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:27.0845029Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:27.1030906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:27.1031355Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:27.1034294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:27.1034775Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:27.2397415Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:27.2576828Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:27.2577616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:27.2773546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:27.6337439Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:46:27.6371360Z test_complete_world_size (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85265 2022-08-17T13:46:27.6378019Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85266 2022-08-17T13:46:27.6384180Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 85267 2022-08-17T13:46:27.6390943Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 85268 2022-08-17T13:46:29.0725720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:29.0726236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:29.0726838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:29.0727298Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:29.0750782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:29.0751466Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:29.0754035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:29.0754502Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:29.1032019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:29.1032478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:29.1035150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:29.1035609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:29.1080476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:29.1080936Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:29.1084190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:29.1084655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:29.2420282Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:29.2450837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:29.2777631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:29.2791028Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:29.6451516Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T13:46:29.6469490Z test_create_sharded_tensor_like (__main__.TestShardedTensorChunked) 2022-08-17T13:46:29.6483480Z Test tensor like methods, i.e. torch.zeros_like(...), torch.full_like, etc. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85401 2022-08-17T13:46:29.6489905Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85402 2022-08-17T13:46:29.6496358Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 85403 2022-08-17T13:46:29.6502958Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 85404 2022-08-17T13:46:31.0807026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:31.0807543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:31.0808606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:31.0809107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:31.1294797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:31.1295284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:31.1297492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:31.1297974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:31.1350167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:31.1350628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:31.1353933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:31.1354420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:31.1424267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:31.1424953Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:31.1428009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:31.1428490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:31.2485202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:31.3030952Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:31.3042810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:31.3122552Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:31.6561635Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T13:46:31.6572450Z test_create_sharded_tensor_with_full (__main__.TestShardedTensorChunked) 2022-08-17T13:46:31.6586805Z Test sharded_tensor.full(...) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85537 2022-08-17T13:46:31.6593259Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85538 2022-08-17T13:46:31.6599888Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 85539 2022-08-17T13:46:31.6606600Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 85540 2022-08-17T13:46:33.1173320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:33.1173820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:33.1175085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:33.1175851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:33.1464454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:33.1465236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:33.1468499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:33.1469229Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:33.1731865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:33.1732622Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:33.1735394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:33.1736124Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:33.1745020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:33.1745766Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:33.1748577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:33.1749321Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:33.2901895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:33.3169266Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:33.3444378Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:33.3462841Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:33.6667842Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:46:33.6676036Z test_create_sharded_tensor_with_ones (__main__.TestShardedTensorChunked) 2022-08-17T13:46:33.6690241Z Test sharded_tensor.ones(...) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85673 2022-08-17T13:46:33.6696325Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85674 2022-08-17T13:46:33.6702933Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 85675 2022-08-17T13:46:33.6709573Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 85676 2022-08-17T13:46:35.1086932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:35.1087439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:35.1088179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:35.1088873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:35.1106793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:35.1107253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:35.1110204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:35.1110681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:35.1789606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:35.1790074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:35.1792250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:35.1792741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:35.1965281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:35.1965751Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:35.1968639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:35.1969126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:35.2794870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:35.2808727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:35.3468558Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:35.3703023Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:35.7770003Z skip: Need at least 4 CUDA devices (2.110s) 2022-08-17T13:46:35.7784070Z test_create_sharded_tensor_with_rand (__main__.TestShardedTensorChunked) 2022-08-17T13:46:35.7798247Z Test sharded_tensor.rand(...)/randn(...) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85809 2022-08-17T13:46:35.7804349Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85810 2022-08-17T13:46:35.7810926Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 85811 2022-08-17T13:46:35.7817090Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 85812 2022-08-17T13:46:37.2132405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:37.2132910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:37.2134027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:37.2134503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:37.2360004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:37.2360492Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:37.2363043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:37.2363505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:37.2455519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:37.2455975Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:37.2458902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:37.2459507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:37.2604907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:37.2605368Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:37.2608192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:37.2608652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:37.3817121Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:37.4083749Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:37.4142715Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:37.4334168Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:37.7875143Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:46:37.7882909Z test_create_sharded_tensor_with_zeros (__main__.TestShardedTensorChunked) 2022-08-17T13:46:37.7896113Z Test sharded_tensor.zeros(...) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85945 2022-08-17T13:46:37.7902159Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85946 2022-08-17T13:46:37.7908543Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 85947 2022-08-17T13:46:37.7914759Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 85948 2022-08-17T13:46:39.2262292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:39.2263830Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:39.2265064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:39.2265963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:39.2311732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:39.2312654Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:39.2314384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:39.2315294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:39.2815820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:39.2816771Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:39.2817977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:39.2818925Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:39.3204157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:39.3205110Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:39.3206305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:39.3207247Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:39.3950886Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:39.3996283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:39.4507616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:39.4960107Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:39.8974933Z skip: Need at least 4 CUDA devices (2.110s) 2022-08-17T13:46:39.8982506Z test_gather_even (__main__.TestShardedTensorChunked) 2022-08-17T13:46:39.8996090Z Test _sharded_tensor.gather(...) with evenly distributed._shards ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86081 2022-08-17T13:46:39.9002284Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86082 2022-08-17T13:46:39.9008929Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 86083 2022-08-17T13:46:39.9015284Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 86084 2022-08-17T13:46:41.3449479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:41.3450481Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:41.3451688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:41.3452598Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:41.3581768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:41.3582736Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:41.3585059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:41.3586008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:41.3624812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:41.3625792Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:41.3628892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:41.3629851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:41.3825191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:41.3826169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:41.3828844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:41.3829771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:41.5133665Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:41.5254734Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:41.5344954Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:41.5493469Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:41.9074367Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:46:41.9082525Z test_gather_uneven (__main__.TestShardedTensorChunked) 2022-08-17T13:46:41.9096178Z Test _sharded_tensor.gather(...) with unevenly distributed._shards ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86217 2022-08-17T13:46:41.9102401Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86218 2022-08-17T13:46:41.9108856Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 86219 2022-08-17T13:46:41.9115421Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 86220 2022-08-17T13:46:43.3642574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:43.3643677Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:43.3645114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:43.3646014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:43.3705423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:43.3706328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:43.3710059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:43.3711129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:43.4013674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:43.4014618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:43.4016167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:43.4017131Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:43.4357661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:43.4358605Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:43.4360229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:43.4361177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:43.5309850Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:43.5382547Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:43.5695420Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:43.6104734Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:44.0178199Z skip: Need at least 4 CUDA devices (2.110s) 2022-08-17T13:46:44.0204588Z test_insufficient_sharding_dims (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86353 2022-08-17T13:46:44.0210718Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86354 2022-08-17T13:46:44.0216954Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 86355 2022-08-17T13:46:44.0223084Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 86356 2022-08-17T13:46:45.4745200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:45.4746158Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:45.4747313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:45.4748251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:45.4755973Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:45.4756876Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:45.4758841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:45.4759762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:45.4760936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:45.4761756Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:45.4764637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:45.4765825Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:45.4925191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:45.4925957Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:45.4928061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:45.4928882Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:45.6550766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:45.6551406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:45.6553181Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:45.6669785Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:46.0282898Z skip: Need at least 4 CUDA devices (2.010s) 
2022-08-17T13:46:46.0303884Z test_invalid_pg_rpc_ranks (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86489 2022-08-17T13:46:46.0310494Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86490 2022-08-17T13:46:46.0316517Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 86491 2022-08-17T13:46:46.0322934Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 86492 2022-08-17T13:46:47.5492486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:47.5493531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:47.5494330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:47.5494828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:47.5935455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:47.5936186Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:47.5937820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:47.5938604Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:47.5968625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:47.5969375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:47.5971996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:47.5972789Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:47.6135638Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:47.6136638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:47.6139606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:47.6140436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:47.7169052Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:47.7664331Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:47.7707155Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:47.7869589Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:48.1384565Z skip: Need at least 4 CUDA devices (2.110s) 2022-08-17T13:46:48.1418406Z test_invalid_sharding (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86625 2022-08-17T13:46:48.1425232Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86626 2022-08-17T13:46:48.1432402Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 86627 2022-08-17T13:46:48.1439579Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 86628 2022-08-17T13:46:49.6245932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:49.6246409Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:49.6247503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:49.6247977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:49.6428704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:49.6429161Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:49.6431519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:49.6432000Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:49.6477963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:49.6478424Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:49.6480951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:49.6481427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:49.6583582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:49.6584038Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:49.6587329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:49.6587799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:49.7928921Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:49.8097116Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:49.8176620Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:49.8251910Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:50.1497552Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T13:46:50.1523414Z test_load_state_dict_errors (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86761 2022-08-17T13:46:50.1529878Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86762 2022-08-17T13:46:50.1536073Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 86763 2022-08-17T13:46:50.1542720Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 86764 2022-08-17T13:46:51.5951911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:51.5952433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:51.5953435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:51.5953952Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:51.5999255Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:51.5999729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:51.6002697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:51.6003177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:51.6264213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:51.6264683Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:51.6267698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:51.6268177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:51.6774202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:51.6774694Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:51.6776689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:51.6777180Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:51.7638542Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:51.7655509Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:51.8013436Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:51.8544909Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:52.1600903Z skip: Need at least 4 CUDA devices (2.010s) 
2022-08-17T13:46:52.1630046Z test_multiple_local_shards (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86897 2022-08-17T13:46:52.1635393Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86898 2022-08-17T13:46:52.1641337Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 86899 2022-08-17T13:46:52.1647333Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 86900 2022-08-17T13:46:53.6260922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:53.6261424Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:53.6262005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:53.6262486Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:53.6268438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:53.6268915Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:53.6272100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:53.6272601Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:53.6530921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:53.6531373Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:53.6533994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:53.6705174Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:53.6705778Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:53.6706422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:53.6708971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:53.6709447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:53.7991168Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:53.8062760Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:53.8195219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:53.8373837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:54.1706029Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:46:54.1737255Z test_new_group (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87033 2022-08-17T13:46:54.1743570Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87034 2022-08-17T13:46:54.1750238Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 87035 2022-08-17T13:46:54.1756823Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 87036 2022-08-17T13:46:55.6410804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:55.6411307Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:55.6412133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:55.6412595Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:55.6841351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:55.6841841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:55.6844507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:55.6844981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:55.7018334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:55.7018785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:55.7021666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:55.7022127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:55.7110205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:46:55.7110662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:55.7113700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:55.7114394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:55.8088436Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:55.8563495Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:55.8769484Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:55.8836102Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:56.2815840Z skip: Need at least 4 CUDA devices (2.111s) 2022-08-17T13:46:56.2847177Z test_partial_world_size (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87169 2022-08-17T13:46:56.2853280Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87170 2022-08-17T13:46:56.2859601Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 87171 2022-08-17T13:46:56.2866125Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 87172 2022-08-17T13:46:57.6844768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:57.6845287Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:57.6846257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:57.6846764Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:57.7266404Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:57.7266867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:57.7269397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:57.7269890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:57.7273308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:57.7273750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:57.7276778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:57.7277252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:57.7425045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:57.7425485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:57.7428465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:57.7428944Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:57.8515329Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:57.9005703Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:46:57.9012053Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:46:57.9149485Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:58.2929476Z skip: Need at least 4 CUDA devices (2.011s) 
2022-08-17T13:46:58.2956808Z test_sharded_tensor_metadata (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87305 2022-08-17T13:46:58.2962786Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87306 2022-08-17T13:46:58.2968770Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 87307 2022-08-17T13:46:58.2975100Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 87308 2022-08-17T13:46:59.7627259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:59.7627761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:59.7628547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:59.7629027Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:59.7661623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:59.7662089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:59.7665141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:59.7665618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:59.7780895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:59.7781356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:59.7784371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:59.7784846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:59.8328396Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:46:59.8328886Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:46:59.8331021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:46:59.8331511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:46:59.9296625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:46:59.9340833Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:46:59.9457539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:00.0081917Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:00.4037991Z skip: Need at least 4 CUDA devices (2.111s) 2022-08-17T13:47:00.4068919Z test_sharded_tensor_sizes (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87441 2022-08-17T13:47:00.4075064Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87442 2022-08-17T13:47:00.4081304Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 87443 2022-08-17T13:47:00.4087682Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 87444 2022-08-17T13:47:01.8443456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:01.8443971Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:01.8444571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:01.8445033Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:01.8540591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:01.8541057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:01.8544386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:01.8544880Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:01.8934604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:01.8935325Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:01.8936661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:01.8937114Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:01.9187958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:01.9188417Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:01.9190831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:01.9191504Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:02.0134025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:02.0222525Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:02.0662232Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:02.0881152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:02.4152767Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T13:47:02.4177642Z test_sharding_columns (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87577 2022-08-17T13:47:02.4184125Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87578 2022-08-17T13:47:02.4190865Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 87579 2022-08-17T13:47:02.4197012Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 87580 2022-08-17T13:47:03.8615291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:03.8615767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:03.8616373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:03.8616847Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:03.8631602Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:03.8632047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:03.8634848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:03.8635345Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:03.9850752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:03.9851209Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:03.9853399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:03.9853880Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:03.9862155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:03.9862592Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:03.9866581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:03.9867070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:04.0336662Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:04.0344378Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:04.1673112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:04.1675101Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:04.5259099Z skip: Need at least 4 CUDA devices (2.111s) 
2022-08-17T13:47:04.5283961Z test_state_dict (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87713 2022-08-17T13:47:04.5290606Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87714 2022-08-17T13:47:04.5296714Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 87715 2022-08-17T13:47:04.5303227Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 87716 2022-08-17T13:47:05.9657055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:05.9658135Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:05.9658897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:05.9659371Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:06.0190350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:06.0191295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:06.0193249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:06.0194051Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:06.0234493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:06.0235295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:06.0237256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:06.0238054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:06.0247402Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:06.0248183Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:06.0250892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:06.0251689Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:06.1324646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:06.1860645Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:06.1945293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:06.1972181Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:06.5362552Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:47:06.5385503Z test_state_dict_new_group (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87849 2022-08-17T13:47:06.5391991Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87850 2022-08-17T13:47:06.5398313Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 87851 2022-08-17T13:47:06.5404911Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 87852 2022-08-17T13:47:07.9643331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:07.9644097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:07.9645278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:07.9645791Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:07.9829745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:07.9830195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:07.9832821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:07.9833303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:08.0285928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:08.0286520Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:08.0288860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:08.0289339Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:08.0651000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:08.0651449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:08.0654267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:08.0654754Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:08.1302558Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:08.1491829Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:08.1953913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:08.2400969Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:08.6467775Z skip: Need at least 4 CUDA devices (2.110s) 2022-08-17T13:47:08.6488743Z test_state_dict_no_sharded_tensors (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87985 2022-08-17T13:47:08.6494977Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87986 2022-08-17T13:47:08.6501545Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 87987 2022-08-17T13:47:08.6508476Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 87988 2022-08-17T13:47:10.0881542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:10.0882045Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:10.0883277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:10.0883756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:10.1565166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:10.1565724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:10.1566305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:10.1567062Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:10.1569483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:10.1569991Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:10.1570807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:10.1571301Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:10.1590526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:10.1590970Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:10.1593893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:10.1594367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:10.2547533Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:10.3382313Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:10.3382828Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:10.3439727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:10.6567589Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:47:10.6589363Z test_custom_op (__main__.TestShardedTensorCustomOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88121 2022-08-17T13:47:10.6594909Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88122 2022-08-17T13:47:10.6601260Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 88123 2022-08-17T13:47:10.6607223Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 88124 2022-08-17T13:47:12.1032612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:12.1033147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:12.1033735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:12.1034211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:12.1098575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: 
UserWarning: loaded 39 slow tests 2022-08-17T13:47:12.1099035Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:12.1102109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:12.1102592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:12.1211825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:12.1212298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:12.1215471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:12.1215965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:12.1579475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:12.1579927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:12.1582610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:12.1583090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:12.2712900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:12.2770632Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:12.2939231Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:12.3268559Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:12.6665932Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:47:12.6685888Z test_custom_op_errors (__main__.TestShardedTensorCustomOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88257 2022-08-17T13:47:12.6692111Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88258 2022-08-17T13:47:12.6698594Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 88259 2022-08-17T13:47:12.6705260Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 88260 2022-08-17T13:47:14.1718163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:14.1718951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:14.1719551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:14.1720036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:14.1918959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:14.1919430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:14.1922147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:14.1922620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:14.2074811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:14.2075251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:14.2078432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:14.2078916Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:14.2493010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:14.2493461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:14.2495849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:14.2496326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:14.3396090Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:14.3778834Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:14.3954262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:14.4170018Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:14.7766900Z skip: Need at least 4 CUDA devices (2.110s) 2022-08-17T13:47:14.7790068Z test_custom_op_override (__main__.TestShardedTensorCustomOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88393 2022-08-17T13:47:14.7796473Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88394 2022-08-17T13:47:14.7802722Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 88395 2022-08-17T13:47:14.7809205Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 88396 2022-08-17T13:47:16.2210190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:16.2210913Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:16.2211521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:16.2212268Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:16.2404195Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:16.2404650Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:16.2407479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:16.2407955Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:16.3242267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:16.3242986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:16.3244226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:16.3244706Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:16.3326444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:16.3326969Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:16.3329646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:16.3330121Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:16.3872451Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:16.4140603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:16.4949657Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:16.5077455Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:16.8873000Z skip: Need at least 4 CUDA devices (2.110s) 
2022-08-17T13:47:16.8884666Z test_create_sharded_tensor_with_ones (__main__.TestShardedTensorEnumerable) 2022-08-17T13:47:16.8898235Z Test sharded_tensor.ones(...) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88529 2022-08-17T13:47:16.8904750Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88530 2022-08-17T13:47:16.8911477Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 88531 2022-08-17T13:47:16.8918069Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 88532 2022-08-17T13:47:18.3344723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:18.3345329Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:18.3346421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:18.3346920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:18.3363605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:18.3364077Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:18.3367018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:18.3367501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:18.3438959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:18.3439425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:18.3442210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:18.3442691Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:18.3567945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:18.3568417Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:18.3571262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:18.3571739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:18.5053119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:18.5068059Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:18.5123216Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:18.5315641Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:18.8977796Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:47:18.8989938Z test_gather_even (__main__.TestShardedTensorEnumerable) 2022-08-17T13:47:18.9003399Z Test _sharded_tensor.gather(...) with evenly distributed._shards ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88665 2022-08-17T13:47:18.9010051Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88666 2022-08-17T13:47:18.9016183Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 88667 2022-08-17T13:47:18.9022697Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 88668 2022-08-17T13:47:20.3308767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:20.3309775Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:20.3310933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:20.3311899Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:20.3414031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:20.3414961Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:20.3417477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:20.3418445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:20.3565534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:20.3566440Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:20.3569533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:20.3570538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:20.3698416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:20.3699347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:20.3701500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:20.3702451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:20.4976862Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:20.5090390Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:20.5258444Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:20.5423165Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:20.9082054Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:47:20.9092958Z test_gather_uneven (__main__.TestShardedTensorEnumerable) 2022-08-17T13:47:20.9107224Z Test _sharded_tensor.gather(...) with unevenly distributed._shards ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88801 2022-08-17T13:47:20.9115468Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88802 2022-08-17T13:47:20.9123304Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 88803 2022-08-17T13:47:20.9131497Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 88804 2022-08-17T13:47:22.3571288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:22.3572591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:22.3573752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:22.3574713Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:22.3591162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:22.3592062Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:22.3595032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:22.3595940Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:22.3790703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:22.3791603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:22.3793705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:22.3794670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:22.3901219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:22.3902142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:22.3903770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:22.3904669Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:22.5298778Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:22.5301082Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:22.5457644Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:22.5581480Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:22.9193578Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T13:47:22.9228822Z test_grid_sharding (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88937 2022-08-17T13:47:22.9235149Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88938 2022-08-17T13:47:22.9241408Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 88939 2022-08-17T13:47:22.9247553Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 88940 2022-08-17T13:47:24.3617957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:24.3618953Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:24.3620151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:24.3621372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:24.3727672Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:24.3728574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:24.3731299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:24.3732255Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:24.3801278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:24.3802174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:24.3805157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:24.3806149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:24.3941171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:24.3942127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:24.3944717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:24.3945724Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:24.5284206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:24.5390980Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:24.5481762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:24.5672260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:24.9304278Z skip: Need at least 4 CUDA devices (2.011s) 
2022-08-17T13:47:24.9340665Z test_multiple_local_shards (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89073 2022-08-17T13:47:24.9346900Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89074 2022-08-17T13:47:24.9353280Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 89075 2022-08-17T13:47:24.9359408Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 89076 2022-08-17T13:47:26.3814903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:26.3815695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:26.3816326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:26.3816800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:26.4013194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:26.4013675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:26.4016721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:26.4017435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:26.4018042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:26.4018491Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:26.4021506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:26.4021989Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:26.4201753Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:26.4202241Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:26.4204858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:26.4205331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:26.5484816Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:26.5809541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:26.5810055Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:26.5880251Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:26.9416684Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T13:47:26.9450943Z test_new_group (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89209 2022-08-17T13:47:26.9457098Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89210 2022-08-17T13:47:26.9463480Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 89211 2022-08-17T13:47:26.9470193Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 89212 2022-08-17T13:47:28.4313805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:28.4314744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:28.4315961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:28.4316950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:28.4318156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:28.4319034Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:28.4321335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:28.4322251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:28.4324431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:28.4325281Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:28.4330201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:28.4331166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:28.4712709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:28.4713649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:28.4715278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:28.4716265Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:28.6095639Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:28.6103650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:28.6113852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:28.6439119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:29.0530771Z skip: Need at least 4 CUDA devices (2.111s) 2022-08-17T13:47:29.0566690Z test_partial_world_size (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89345 2022-08-17T13:47:29.0572851Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89346 2022-08-17T13:47:29.0579060Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 89347 2022-08-17T13:47:29.0585547Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 89348 2022-08-17T13:47:30.6133891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:30.6134450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:30.6135708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:30.6136547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:30.6445861Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:30.6446519Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:30.6449120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:30.6449843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:30.6496233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:30.6496922Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:30.6499709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:30.6500431Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:30.6755549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:30.6756267Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:30.6758744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:30.6759448Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:30.7802775Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:30.8156516Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:30.8243473Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:30.8428704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:31.1646967Z skip: Need at least 4 CUDA devices (2.112s) 
2022-08-17T13:47:31.1667017Z test_sharded_tensor_device (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89481 2022-08-17T13:47:31.1673042Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89482 2022-08-17T13:47:31.1679057Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 89483 2022-08-17T13:47:31.1685220Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 89484 2022-08-17T13:47:32.5986591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:32.5987099Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:32.5991766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:32.5992642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:32.6111114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:32.6111816Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:32.6113989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:32.6114455Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:32.6548158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:32.6548655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:32.6549875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:32.6550338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:32.7030251Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:32.7030740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:32.7033002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:32.7033465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:32.7688228Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:32.7784962Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:32.8235814Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:32.8826322Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:33.2753375Z skip: Need at least 4 CUDA devices (2.110s) 2022-08-17T13:47:33.2784102Z test_sharded_tensor_metadata (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89617 2022-08-17T13:47:33.2790469Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89618 2022-08-17T13:47:33.2797047Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 89619 2022-08-17T13:47:33.2803196Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 89620 2022-08-17T13:47:34.7122130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:34.7122681Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:34.7123278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:34.7123756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:34.7154549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:34.7154997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:34.7157896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:34.7158377Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:34.7158956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:34.7159387Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:34.7162092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:34.7162586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:34.7455421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:34.7455897Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:34.7459091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:34.7459585Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:34.8816753Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:34.8906092Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:34.8925293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:34.9228431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:35.2860575Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T13:47:35.2898758Z test_sharded_tensor_to_cpu (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89753 2022-08-17T13:47:35.2904575Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89754 2022-08-17T13:47:35.2910783Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 89755 2022-08-17T13:47:35.2916950Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 89756 2022-08-17T13:47:36.7224001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:36.7225095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:36.7225878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:36.7226345Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:36.7709773Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:36.7710466Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:36.7713030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:36.7713745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:36.7830544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:36.7831298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:36.7833988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:36.7834710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:36.7867991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:36.7868727Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:36.7871668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:36.7872386Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:36.8882865Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:36.9465617Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:36.9537330Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:36.9573562Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:37.2975878Z skip: Need at least 4 CUDA devices (2.011s) 
2022-08-17T13:47:37.3003745Z test_sharded_tensor_to_cuda (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89889 2022-08-17T13:47:37.3009391Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89890 2022-08-17T13:47:37.3015788Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 89891 2022-08-17T13:47:37.3021842Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 89892 2022-08-17T13:47:38.7383190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:38.7384096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:38.7384971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:38.7385456Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:38.7560201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:38.7560961Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:38.7563720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:38.7564202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:38.8005018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:38.8005493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:38.8007961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:38.8008430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:38.8047310Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:38.8047782Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:38.8050733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:38.8051208Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:38.9069382Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:38.9308872Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:38.9695510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:38.9756879Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:39.3078837Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:47:39.3120459Z test_sharded_tensor_to_test (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90025 2022-08-17T13:47:39.3126541Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90026 2022-08-17T13:47:39.3132942Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 90027 2022-08-17T13:47:39.3139160Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 90028 2022-08-17T13:47:40.7645532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:40.7646055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:40.7646853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:40.7647336Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:40.8172939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:40.8173452Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:40.8175260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:40.8175962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:40.8199340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:40.8199810Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:40.8200383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:40.8200836Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:40.8203384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled 
tests 2022-08-17T13:47:40.8203908Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:40.8204809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:40.8205322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:40.9344884Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:40.9911091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:40.9981264Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:41.0017187Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:41.4199784Z skip: Need at least 4 CUDA devices (2.112s) 2022-08-17T13:47:41.4235335Z test_uneven_shards (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90161 2022-08-17T13:47:41.4241542Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90162 2022-08-17T13:47:41.4248106Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 90163 2022-08-17T13:47:41.4254414Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 90164 2022-08-17T13:47:42.8695972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:42.8696504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:42.8697579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:42.8698093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:42.8715960Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:42.8716414Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:42.8719817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:42.8720376Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:42.9401424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:42.9401879Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:42.9404770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:42.9405280Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:42.9556409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:42.9556863Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:42.9559826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:42.9560310Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:43.0410418Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:43.0413847Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:43.1144895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:43.1298236Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:43.5314754Z skip: Need at least 4 CUDA devices (2.111s) 
2022-08-17T13:47:43.5350475Z test_with_rpc_names (__main__.TestShardedTensorEnumerable) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90297 2022-08-17T13:47:43.5356390Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90298 2022-08-17T13:47:43.5362705Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 90299 2022-08-17T13:47:43.5368714Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 90300 2022-08-17T13:47:44.9681684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:44.9682652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:44.9683864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:44.9684791Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:45.0245071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:45.0246014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:45.0247208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:45.0248168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:45.0270643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:45.0271548Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:45.0273340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:45.0274276Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:45.0430601Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:45.0431519Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:45.0433411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:45.0434381Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:45.1366586Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:45.1943837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:45.1988048Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:45.2145647Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:45.5426652Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T13:47:45.5448651Z test_init_from_local_shards (__main__.TestShardedTensorFromLocalShards) ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78068 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.002s) 2022-08-17T13:47:45.5488602Z test_init_from_local_shards_and_global_metadata (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90433 2022-08-17T13:47:45.5494412Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90434 2022-08-17T13:47:45.5501060Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 90435 2022-08-17T13:47:45.5508486Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 90436 2022-08-17T13:47:46.9920768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:46.9921520Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:46.9922115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:46.9922964Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:46.9923757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:46.9924342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:46.9924952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:46.9925514Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:46.9945439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:46.9946047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:46.9948522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:46.9948992Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:47.0142811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:47.0143341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:47.0146735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:47.0147193Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:47.1715562Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:47.1716117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:47.1731223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:47.1875534Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:47.5567446Z skip: Need at least 4 CUDA devices (2.012s) 2022-08-17T13:47:47.5612810Z test_init_from_local_shards_and_global_metadata_invalid_shards (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90569 2022-08-17T13:47:47.5619054Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90570 2022-08-17T13:47:47.5625691Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 90571 2022-08-17T13:47:47.5632543Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 90572 2022-08-17T13:47:49.0697175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:49.0697677Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:49.0698758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:49.0699283Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:49.0832900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:49.0833743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:49.0835967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:49.0836611Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:49.1090129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:49.1090801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:49.1093222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:49.1094097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:49.1421340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:49.1422113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:49.1424793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:49.1425578Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:49.2353868Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:49.2534038Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:49.2752383Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:49.3192387Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:49.6694581Z skip: Need at least 4 CUDA devices (2.113s) 2022-08-17T13:47:49.6722229Z test_init_from_local_shards_invalid_local_shards (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90705 2022-08-17T13:47:49.6728485Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90706 2022-08-17T13:47:49.6734834Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 90707 2022-08-17T13:47:49.6741578Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 90708 2022-08-17T13:47:51.1318941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:51.1319883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:51.1321043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:51.1322018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:51.1330560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:51.1331423Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:51.1334423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:51.1335393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:51.1958214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:51.1959131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:51.1960634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:51.1961589Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:51.2126192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:51.2127095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:51.2129718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:51.2130696Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:51.3039860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:51.3064395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:51.3696348Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:51.3809757Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:51.7806302Z skip: Need at least 4 CUDA devices (2.111s) 2022-08-17T13:47:51.7829354Z test_init_from_local_shards_invalid_pin_memory (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90841 2022-08-17T13:47:51.7835497Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90842 2022-08-17T13:47:51.7841484Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 90843 2022-08-17T13:47:51.7847660Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 90844 2022-08-17T13:47:53.2457628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:53.2458139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:53.2458706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:53.2459167Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:53.2459762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:53.2460242Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:53.2461584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:53.2462054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:53.2906110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:53.2906572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:53.2908866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:53.2909349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:53.3115419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:53.3115882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:53.3118930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:53.3119413Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:53.4180712Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:53.4185075Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:53.4602349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:53.4859252Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:53.5018558Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:47:53.5211131Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:47:53.5313752Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-08-17T13:47:53.5314286Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-08-17T13:47:53.5315262Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:47:53.5315969Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:47:53.5316841Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:47:53.5326055Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-08-17T13:47:53.8910135Z skip: Need at least 4 CUDA devices (2.110s) 2022-08-17T13:47:53.8938545Z test_init_from_local_shards_invalid_property_cross_ranks (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90989 2022-08-17T13:47:53.8944787Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90990 2022-08-17T13:47:53.8952486Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 90991 2022-08-17T13:47:53.8958952Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 90992 2022-08-17T13:47:55.3122092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:55.3123080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:55.3124220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:55.3125201Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:55.3148377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:55.3149310Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:55.3151591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:55.3152554Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:55.3300759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:55.3301678Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:55.3303780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T13:47:55.3304804Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:55.3653836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:55.3654793Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:55.3657213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:55.3658216Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:55.4825078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:55.4842161Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:55.4954013Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:55.5405598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:55.9016935Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T13:47:55.9037307Z test_init_from_local_shards_invalid_shards_gaps (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91125 2022-08-17T13:47:55.9042938Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91126 2022-08-17T13:47:55.9049294Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 91127 2022-08-17T13:47:55.9055435Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 91128 2022-08-17T13:47:57.3591721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:57.3592391Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:57.3593486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:57.3594277Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:57.3867889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:57.3868376Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:57.3870853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:57.3871314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:57.4005385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:57.4005851Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:57.4008488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:57.4008954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:57.4505188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:57.4505659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:57.4508739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:57.4509220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:57.5245824Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:57.5536502Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:57.5746296Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:47:57.6261437Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:58.0114645Z skip: Need at least 4 CUDA devices (2.110s) 2022-08-17T13:47:58.0136055Z test_init_from_local_shards_invalid_shards_overlap (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91261 2022-08-17T13:47:58.0141898Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91262 2022-08-17T13:47:58.0148704Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 91263 2022-08-17T13:47:58.0155392Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 91264 2022-08-17T13:47:59.4647348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:59.4648040Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:59.4648994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:59.4649480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:59.4878925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:59.4879926Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:59.4881321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:59.4881779Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:59.4894003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:47:59.4894459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:59.4897243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:59.4897702Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:59.4904164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:47:59.4904622Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:47:59.4908403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:47:59.4908862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:47:59.6385474Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:47:59.6674288Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:47:59.6697267Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:47:59.6762817Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:00.0212996Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:48:00.0240140Z test_init_from_local_shards_new_group (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91397 2022-08-17T13:48:00.0245860Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91398 2022-08-17T13:48:00.0251998Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 91399 2022-08-17T13:48:00.0258325Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 91400 2022-08-17T13:48:01.4712253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:01.4712783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:01.4713343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:01.4713802Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:01.4714412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:01.4714897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:01.4715471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:01.4715948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:01.4742403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:01.4742870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:01.4746653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:01.4747138Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:01.4913761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:48:01.4914222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:01.4917178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:01.4917686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:01.6527671Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:48:01.6528443Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:01.6529303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:48:01.6655213Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:48:02.0315466Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:48:02.0336756Z test_local_shards (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91533 2022-08-17T13:48:02.0342751Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91534 2022-08-17T13:48:02.0349122Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 91535 2022-08-17T13:48:02.0355366Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 91536 2022-08-17T13:48:03.4889665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:03.4890205Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:03.4890803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:03.4891276Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:03.5035142Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:03.5035624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:03.5038660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:03.5039142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:03.5126872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:03.5127326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:03.5130072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:03.5130552Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:03.5511941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:03.5512537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:03.5514758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:03.5515236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:03.6580932Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:48:03.6729985Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:03.6789612Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:48:03.7245339Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:48:04.1415773Z skip: Need at least 4 CUDA devices (2.110s) 
2022-08-17T13:48:04.1449382Z test_st_base_init_from_local_shards_and_global_metadata (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91669 2022-08-17T13:48:04.1455743Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91670 2022-08-17T13:48:04.1462412Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 91671 2022-08-17T13:48:04.1468930Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 91672 2022-08-17T13:48:05.5964971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:05.5965473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:05.5970048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:05.5970540Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:05.5974137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:05.5974855Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:05.5977248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:05.5977710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:05.5992848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:05.5993301Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:05.5995981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:05.5996437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-08-17T13:48:05.6153252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:05.6153717Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:05.6156478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:05.6156930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:05.7761453Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:48:05.7761971Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:48:05.7779649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:48:05.7892752Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:06.1529724Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T13:48:06.1549086Z test_init_from_local_tensor (__main__.TestShardedTensorFromLocalTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91805 2022-08-17T13:48:06.1555331Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91806 2022-08-17T13:48:06.1561723Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 91807 2022-08-17T13:48:06.1568083Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 91808 2022-08-17T13:48:07.6207261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:07.6207786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:07.6208821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:07.6209291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:07.6339682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:07.6340187Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:07.6343252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:07.6343760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:07.6728710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:07.6729214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:07.6732272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:07.6732764Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:07.6752745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:48:07.6753518Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:07.6756353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:07.6756855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:07.7938812Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:48:07.8003467Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:48:07.8423681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:07.8451620Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:48:08.2628743Z skip: Need at least 4 CUDA devices (2.110s) 2022-08-17T13:48:08.2651263Z test_init_from_local_tensor_errors (__main__.TestShardedTensorFromLocalTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91941 2022-08-17T13:48:08.2657358Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91942 2022-08-17T13:48:08.2663883Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 91943 2022-08-17T13:48:08.2670367Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 91944 2022-08-17T13:48:09.7363970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:09.7364967Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:09.7366135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:09.7367044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:09.7428756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:09.7429641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:09.7432599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:09.7433446Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:09.7523737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:09.7524663Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:09.7526963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:09.7527892Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:09.7592781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 
39 slow tests 2022-08-17T13:48:09.7593736Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:09.7595478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:09.7596676Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:09.9043543Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:48:09.9102762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:09.9220658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:48:09.9241286Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:48:10.2728075Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T13:48:10.3245750Z test_serialize_and_deserialize (__main__.TestShardedTensorMetadata) ... ok (0.052s) 2022-08-17T13:48:10.3246243Z 2022-08-17T13:48:10.3247095Z ---------------------------------------------------------------------- 2022-08-17T13:48:10.3247469Z Ran 64 tests in 126.739s 2022-08-17T13:48:10.3247653Z 2022-08-17T13:48:10.3247751Z OK (skipped=62) 2022-08-17T13:48:10.3247909Z 2022-08-17T13:48:10.3248046Z Generating XML reports... 
2022-08-17T13:48:10.3283768Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestCreateTensorFromParams-20220817134603.xml 2022-08-17T13:48:10.3286765Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorMetadata-20220817134603.xml 2022-08-17T13:48:10.3291424Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestLocalTensor-20220817134603.xml 2022-08-17T13:48:10.3296037Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestModuleHookApi-20220817134603.xml 2022-08-17T13:48:10.3301755Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardMetadata-20220817134603.xml 2022-08-17T13:48:10.3306609Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardParameter-20220817134603.xml 2022-08-17T13:48:10.3311379Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardTensor-20220817134603.xml 2022-08-17T13:48:10.3342096Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorChunked-20220817134603.xml 2022-08-17T13:48:10.3348421Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorCustomOps-20220817134603.xml 2022-08-17T13:48:10.3377884Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorEnumerable-20220817134603.xml 2022-08-17T13:48:10.3393803Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorFromLocalShards-20220817134603.xml 
2022-08-17T13:48:10.3399228Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorFromLocalTensor-20220817134603.xml 2022-08-17T13:48:10.6918383Z Running distributed/test_c10d_pypg ... [2022-08-17 13:48:10.691338] 2022-08-17T13:48:10.6919130Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_pypg.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:48:10.691412] 2022-08-17T13:48:12.3101593Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_pypg 2022-08-17T13:48:12.3123860Z 2022-08-17T13:48:12.3124098Z Running tests... 2022-08-17T13:48:12.3124541Z ---------------------------------------------------------------------- 2022-08-17T13:48:12.3135135Z test_ddp_checkpointing_dynamic_module (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:13.8391387Z Dynamic module can be checkpointed, multiple times, with non-reentrant ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:48:13.8582622Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92112 2022-08-17T13:48:15.2297085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:15.2297627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:15.2298399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:15.2298875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:15.3968868Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:16.6121846Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpftdc0mnn 2022-08-17T13:48:16.6123396Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpftdc0mnn/_remote_module_non_scriptable.py 
2022-08-17T13:48:17.0326488Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:48:17.0327121Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:48:17.4681537Z ok (5.155s) 2022-08-17T13:48:17.4686607Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:17.4699244Z Dynamic module can be checkpointed multiple times with weight sharing ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92149 2022-08-17T13:48:18.8878592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:18.8879079Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:18.8881154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:18.8881642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:19.0636946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:20.3061765Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcljdkvxb 2022-08-17T13:48:20.3063045Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcljdkvxb/_remote_module_non_scriptable.py 2022-08-17T13:48:20.7340424Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:48:20.7341033Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:48:21.1788318Z ok (3.711s) 2022-08-17T13:48:21.1795091Z test_ddp_checkpointing_once_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:21.1807596Z DDP works as expected when 
layer is checkpointed only once. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92186 2022-08-17T13:48:22.5719021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:22.5720081Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:22.5721319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:22.5722287Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:22.7491349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:24.0076529Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7ni2ygys 2022-08-17T13:48:24.0077609Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7ni2ygys/_remote_module_non_scriptable.py 2022-08-17T13:48:24.4445453Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:48:24.4446456Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:48:24.4506632Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:24.4808853Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:24.4957186Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-08-17T13:48:24.4958871Z warnings.warn( 2022-08-17T13:48:24.5065082Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:24.5269132Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:24.5556720Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:24.5805737Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:24.9896564Z ok (3.811s) 2022-08-17T13:48:24.9903816Z test_ddp_checkpointing_once_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:24.9917020Z DDP works as expected when layer is checkpointed only once. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92223 2022-08-17T13:48:26.3651737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:26.3652284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:26.3653199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:26.3653685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:26.5389433Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:27.8009985Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvu5bcwls 2022-08-17T13:48:27.8010817Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvu5bcwls/_remote_module_non_scriptable.py 2022-08-17T13:48:28.2280631Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:48:28.2281266Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 
2022-08-17T13:48:28.2341096Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:28.2643716Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:28.2792071Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:48:28.2792924Z warnings.warn( 2022-08-17T13:48:28.2896825Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:28.3101064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:28.3418103Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:28.3661994Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:28.7017614Z ok (3.712s) 2022-08-17T13:48:28.7022647Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:28.7036073Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92260 2022-08-17T13:48:30.1267134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:30.1267640Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:30.1268695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:30.1269272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:30.3017354Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:31.5711928Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0e3yekiv 2022-08-17T13:48:31.5712875Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0e3yekiv/_remote_module_non_scriptable.py 2022-08-17T13:48:31.9972989Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:48:32.0107077Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:48:32.0107550Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:32.0404041Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:32.4124250Z ok (3.711s) 2022-08-17T13:48:32.4129380Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:32.4142692Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92297 2022-08-17T13:48:33.8219179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:33.8220179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:33.8221349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:33.8222262Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:33.9966201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:35.2294483Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdslujtnt 2022-08-17T13:48:35.2295638Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdslujtnt/_remote_module_non_scriptable.py 2022-08-17T13:48:35.6565729Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:48:35.6566350Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:48:35.6702526Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:35.7013543Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:36.1233544Z ok (3.711s) 2022-08-17T13:48:36.1241292Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:36.1254336Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92334 2022-08-17T13:48:37.5394436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:37.5394956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:37.5395910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:37.5396673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:37.7128964Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:38.9445735Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp39wca1z8 2022-08-17T13:48:38.9446768Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp39wca1z8/_remote_module_non_scriptable.py 2022-08-17T13:48:39.3820004Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:48:39.3820604Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:48:39.3890520Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:39.4136710Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-08-17T13:48:39.4493193Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:39.8343510Z ok (3.711s) 2022-08-17T13:48:39.8351974Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:39.8365005Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92371 2022-08-17T13:48:41.2148553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:41.2149084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:41.2149816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:41.2150282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:41.3882343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:42.6277924Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgxhfn4tq 2022-08-17T13:48:42.6279353Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgxhfn4tq/_remote_module_non_scriptable.py 2022-08-17T13:48:43.0559821Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:48:43.0560989Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:48:43.4451335Z ok (3.611s) 2022-08-17T13:48:43.4456073Z test_ddp_checkpointing_twice_weight_sharing (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:43.4469103Z Checkpointing should work with static graph in the case of checkpointing ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92408 2022-08-17T13:48:44.8713626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:44.8714174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:44.8715148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:44.8715691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:45.0476839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:46.2819601Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy2k082kf 2022-08-17T13:48:46.2820743Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy2k082kf/_remote_module_non_scriptable.py 2022-08-17T13:48:46.7204969Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:48:46.7205575Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:48:46.7332850Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:46.7618958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:47.1556933Z ok (3.710s) 2022-08-17T13:48:47.1564452Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:47.1577346Z With reentrant autograd checkpointing impl, DDP will fail when there are ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92445 2022-08-17T13:48:48.5430914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:48.5431406Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:48.5432444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:48.5433024Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:48.7198185Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:49.9768557Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprxtan7ws 2022-08-17T13:48:49.9769166Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprxtan7ws/_remote_module_non_scriptable.py 2022-08-17T13:48:50.4032485Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:48:50.4033071Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:48:50.4035654Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-08-17T13:48:50.4299757Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:48:50.4300518Z warnings.warn( 2022-08-17T13:48:50.4402843Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:50.4895733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:50.8669456Z ok (3.711s) 2022-08-17T13:48:50.8677312Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:50.8690722Z With reentrant autograd checkpointing impl, DDP will fail when there are ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92482 2022-08-17T13:48:52.2831020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:52.2831550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:52.2832932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:52.2833492Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:52.4562197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:53.7041589Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8kv91cyk 2022-08-17T13:48:53.7042420Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8kv91cyk/_remote_module_non_scriptable.py 2022-08-17T13:48:54.1403499Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:48:54.1404412Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:48:54.1538698Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:48:54.1539447Z warnings.warn( 2022-08-17T13:48:54.1663645Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:54.2049297Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:54.5779011Z ok (3.711s) 2022-08-17T13:48:54.5788431Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:54.5801645Z Test that checkpointing with weight sharing works. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92519 2022-08-17T13:48:55.9559970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:55.9560513Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:55.9561582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:55.9562058Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:56.1311719Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:48:57.3899447Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpixnscnx3 2022-08-17T13:48:57.3900081Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpixnscnx3/_remote_module_non_scriptable.py 2022-08-17T13:48:57.8161192Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:48:57.8161813Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:48:57.8220809Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:57.8565006Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:48:58.2888117Z ok (3.711s) 2022-08-17T13:48:58.2897082Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-08-17T13:48:58.2909829Z Test that checkpointing with weight sharing works. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92556 2022-08-17T13:48:59.6732051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:48:59.6733039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:48:59.6734253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:48:59.6735172Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:48:59.8483391Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:01.0963339Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp37qfut06 2022-08-17T13:49:01.0964395Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp37qfut06/_remote_module_non_scriptable.py 2022-08-17T13:49:01.5433103Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:01.5433722Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:01.5497365Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:01.5788124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:01.5980476Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:01.6256366Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:02.0010079Z ok (3.712s) 2022-08-17T13:49:02.0032123Z test_ddp_invoke_work_object (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92593 2022-08-17T13:49:03.4274180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:03.4274699Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:03.4275825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:03.4276307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:03.6038241Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:03.6155501Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmped93pgfp 2022-08-17T13:49:03.6158238Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmped93pgfp/_remote_module_non_scriptable.py 2022-08-17T13:49:03.9084312Z ok (1.907s) 2022-08-17T13:49:03.9101304Z test_ddp_with_pypg (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92629 2022-08-17T13:49:05.3114341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:05.3114845Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:05.3116024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:05.3116540Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:05.4855681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:05.4973907Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0zeyi7so 2022-08-17T13:49:05.4976935Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0zeyi7so/_remote_module_non_scriptable.py 2022-08-17T13:49:05.5181758Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:05.8148586Z ok (1.906s) 2022-08-17T13:49:05.8164723Z test_ddp_with_pypg_with_grad_views (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92665 2022-08-17T13:49:07.1810343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:07.1810876Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:07.1812279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:07.1813035Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:07.3559954Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:07.3680572Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgn_e_r5w 2022-08-17T13:49:07.3683992Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgn_e_r5w/_remote_module_non_scriptable.py 2022-08-17T13:49:07.3889849Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:07.6208066Z ok (1.806s) 2022-08-17T13:49:07.6228480Z test_invalid_powerSGD_state (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92701 2022-08-17T13:49:09.0242809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:09.0243685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:09.0245022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:09.0245540Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:09.1998636Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:09.2004438Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:49:09.2005809Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:49:09.2006895Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:49:09.2008240Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; 
orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:49:09.2009545Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:49:09.2010610Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:49:09.4272589Z ok (1.806s) 2022-08-17T13:49:09.4472290Z test_sync_batch_norm_empty_input (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92735 2022-08-17T13:49:10.8124085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:10.8125096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:10.8126727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:10.8127667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:10.9872787Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:12.2113410Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm4_dmsty 2022-08-17T13:49:12.2114519Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm4_dmsty/_remote_module_non_scriptable.py 2022-08-17T13:49:12.2273390Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:12.2273959Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:12.2302917Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:12.5547777Z ok (3.127s) 2022-08-17T13:49:12.5573452Z test_sync_batch_norm_only_empty_input (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92772 2022-08-17T13:49:13.9356502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:13.9357338Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:13.9358494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:13.9358979Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:14.1092893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:15.3555756Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa8z65hsa 2022-08-17T13:49:15.3556925Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa8z65hsa/_remote_module_non_scriptable.py 2022-08-17T13:49:15.3718503Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:15.3719173Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:15.3739623Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:15.7650116Z ok (3.210s) 2022-08-17T13:49:15.7655105Z test_ddp_checkpointing_dynamic_module (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:49:15.7669500Z Dynamic module can be checkpointed, multiple times, with non-reentrant ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92809 2022-08-17T13:49:17.1526531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:17.1527071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:17.1528010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:17.1528485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:17.3203069Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:18.5585565Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpajv09pdg 2022-08-17T13:49:18.5586187Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpajv09pdg/_remote_module_non_scriptable.py 2022-08-17T13:49:18.9746228Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:18.9746843Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:19.3754608Z ok (3.610s) 2022-08-17T13:49:19.3759498Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:49:19.3772792Z Dynamic module can be checkpointed multiple times with weight sharing ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92846 2022-08-17T13:49:20.7949863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:20.7950376Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:20.7951398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:20.7951904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:20.9684217Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:22.2260303Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8vkat6x3 2022-08-17T13:49:22.2261152Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8vkat6x3/_remote_module_non_scriptable.py 2022-08-17T13:49:22.6550465Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:22.6551465Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:23.0860052Z ok (3.710s) 2022-08-17T13:49:23.0866956Z test_ddp_checkpointing_once_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:49:23.0880812Z DDP works as expected when layer is checkpointed only once. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92883 2022-08-17T13:49:24.5043904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:24.5044408Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:24.5045654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:24.5046137Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:24.6778560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:25.9335586Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqxd1v3oy 2022-08-17T13:49:25.9336392Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqxd1v3oy/_remote_module_non_scriptable.py 2022-08-17T13:49:26.3590864Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:26.3591473Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:26.3650703Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:26.3949282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:26.4095127Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-08-17T13:49:26.4095886Z warnings.warn( 2022-08-17T13:49:26.4200572Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:26.4402755Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:26.4687698Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:26.4931502Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:26.8970372Z ok (3.811s) 2022-08-17T13:49:26.8978412Z test_ddp_checkpointing_once_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:49:26.8993892Z DDP works as expected when layer is checkpointed only once. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92920 2022-08-17T13:49:28.3073019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:28.3073509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:28.3074525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:28.3075046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:28.4782336Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:29.7113338Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxe29lmp9 2022-08-17T13:49:29.7113966Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxe29lmp9/_remote_module_non_scriptable.py 2022-08-17T13:49:30.1298639Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:30.1299239Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 
2022-08-17T13:49:30.1359345Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:30.1791500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:30.1936972Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:49:30.1937722Z warnings.warn( 2022-08-17T13:49:30.2040323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:30.2241848Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:30.2527791Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:30.2770545Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:30.6080781Z ok (3.711s) 2022-08-17T13:49:30.6085258Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:49:30.6099128Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92957 2022-08-17T13:49:32.0075058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:32.0075572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:32.0076381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:32.0076842Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:32.1767251Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:33.3986943Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe2h5vfeo 2022-08-17T13:49:33.3991369Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe2h5vfeo/_remote_module_non_scriptable.py 2022-08-17T13:49:33.8184884Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:33.8185526Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:33.8314862Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:33.8597741Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:34.2183440Z ok (3.610s) 2022-08-17T13:49:34.2188698Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:49:34.2202210Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92994 2022-08-17T13:49:35.6365050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:35.6365556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:35.6367172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:35.6367724Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:35.8110905Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:37.0474303Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnwfriyi3 2022-08-17T13:49:37.0475227Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnwfriyi3/_remote_module_non_scriptable.py 2022-08-17T13:49:37.4736164Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:37.4736796Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:37.4874162Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:37.5183648Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:37.9287233Z ok (3.710s) 2022-08-17T13:49:37.9295192Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:49:37.9320669Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93031 2022-08-17T13:49:39.3581120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:39.3581637Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:39.3583084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:39.3583985Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:39.5325455Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:40.7841968Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4y27jfvu 2022-08-17T13:49:40.7842880Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4y27jfvu/_remote_module_non_scriptable.py 2022-08-17T13:49:41.2192524Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:41.2193129Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:41.2262086Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:41.2508419Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-08-17T13:49:41.2863501Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:41.7409390Z ok (3.812s) 2022-08-17T13:49:41.7416939Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:49:41.7431286Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93068 2022-08-17T13:49:43.1392751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:43.1393268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:43.1394434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:43.1395226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:43.3165202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:44.5580459Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl411caf_ 2022-08-17T13:49:44.5581457Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl411caf_/_remote_module_non_scriptable.py 2022-08-17T13:49:44.9867401Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:44.9868005Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:45.4518588Z ok (3.711s) 2022-08-17T13:49:45.4523852Z test_ddp_checkpointing_twice_weight_sharing (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:49:45.4538838Z Checkpointing should work with static graph in the case of checkpointing ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93105 2022-08-17T13:49:46.8616075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:46.8616579Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:46.8617829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:46.8618316Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:47.0349201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:48.2714383Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxc0pymhj 2022-08-17T13:49:48.2715299Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxc0pymhj/_remote_module_non_scriptable.py 2022-08-17T13:49:48.7027423Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:48.7028057Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:48.7153745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:48.7435197Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:49.1624736Z ok (3.711s) 2022-08-17T13:49:49.1633584Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:49:49.1648649Z With reentrant autograd checkpointing impl, DDP will fail when there are ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93142 2022-08-17T13:49:50.6114609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:50.6115164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:50.6116681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:50.6117190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:50.7875327Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:52.0320394Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnwgarwfq 2022-08-17T13:49:52.0321554Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnwgarwfq/_remote_module_non_scriptable.py 2022-08-17T13:49:52.4527521Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:52.4528111Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:52.4530174Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-08-17T13:49:52.4796872Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:49:52.4797621Z warnings.warn( 2022-08-17T13:49:52.4900736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:52.5402505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:52.8734718Z ok (3.711s) 2022-08-17T13:49:52.8742668Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:49:52.8757124Z With reentrant autograd checkpointing impl, DDP will fail when there are ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93179 2022-08-17T13:49:54.2649345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:54.2649853Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:54.2651059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:54.2651570Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:54.4394213Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:55.6829439Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpva9trpv6 2022-08-17T13:49:55.6830422Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpva9trpv6/_remote_module_non_scriptable.py 2022-08-17T13:49:56.1207443Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:56.1208049Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:56.1343131Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1747: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-08-17T13:49:56.1344623Z warnings.warn( 2022-08-17T13:49:56.1470853Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:56.1861664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:56.5847736Z ok (3.711s) 2022-08-17T13:49:56.5857099Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:49:56.5871019Z Test that checkpointing with weight sharing works. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93216 2022-08-17T13:49:57.9656556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:49:57.9657056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:49:57.9658447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:49:57.9659225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:49:58.1397301Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:49:59.3920020Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyjrro6i6 2022-08-17T13:49:59.3920971Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyjrro6i6/_remote_module_non_scriptable.py 2022-08-17T13:49:59.8215139Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:49:59.8216206Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:49:59.8275253Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:49:59.8613194Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:50:00.2959106Z ok (3.711s) 2022-08-17T13:50:00.2968032Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-08-17T13:50:00.2982117Z Test that checkpointing with weight sharing works. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93253 2022-08-17T13:50:01.7279661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:01.7280269Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:01.7281333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:01.7281910Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:01.9036170Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:03.1635877Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0rmbtrh1 2022-08-17T13:50:03.1637117Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0rmbtrh1/_remote_module_non_scriptable.py 2022-08-17T13:50:03.6064772Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:03.6065432Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:03.6128728Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:50:03.6414583Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:50:03.6607906Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:50:03.6882068Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:50:04.1073832Z ok (3.811s) 2022-08-17T13:50:04.1097172Z test_ddp_invoke_work_object (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93290 2022-08-17T13:50:05.4999997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:05.5000527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:05.5001552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:05.5002096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:05.6690988Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:05.6804848Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0nrcjtlg 2022-08-17T13:50:05.6807041Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0nrcjtlg/_remote_module_non_scriptable.py 2022-08-17T13:50:05.9141584Z ok (1.807s) 2022-08-17T13:50:05.9161718Z test_ddp_with_pypg (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93326 2022-08-17T13:50:07.3323467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:07.3323984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:07.3325410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:07.3326097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:07.5120849Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:07.5242024Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa98zp4t1 2022-08-17T13:50:07.5244877Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa98zp4t1/_remote_module_non_scriptable.py 2022-08-17T13:50:07.5454333Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:50:07.8208428Z ok (1.907s) 2022-08-17T13:50:07.8225213Z test_ddp_with_pypg_with_grad_views (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93362 2022-08-17T13:50:09.2269477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:09.2269975Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:09.2271075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:09.2271562Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:09.3997805Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:09.4116426Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbkecc8g_ 2022-08-17T13:50:09.4119351Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbkecc8g_/_remote_module_non_scriptable.py 2022-08-17T13:50:09.4324699Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:50:09.7271553Z ok (1.906s) 2022-08-17T13:50:09.7290321Z test_invalid_powerSGD_state (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93398 2022-08-17T13:50:11.0946964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:11.0947465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:11.0949309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:11.0949789Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:11.2696564Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:11.2703291Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:50:11.2705270Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:50:11.2706761Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:50:11.2707968Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; 
orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:50:11.2709023Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:50:11.2710083Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-08-17T13:50:11.5339462Z ok (1.807s) 2022-08-17T13:50:11.5365772Z test_sync_batch_norm_empty_input (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93432 2022-08-17T13:50:12.9387162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:12.9387664Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:12.9388848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:12.9389337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:13.1130145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:14.3640086Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmped5voc10 2022-08-17T13:50:14.3640688Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmped5voc10/_remote_module_non_scriptable.py 2022-08-17T13:50:14.3806641Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:14.3807203Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:14.3837721Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:50:14.7453700Z ok (3.211s) 2022-08-17T13:50:14.7481231Z test_sync_batch_norm_only_empty_input (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93469 2022-08-17T13:50:16.2059477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:16.2060243Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:16.2061216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:16.2061678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:16.3791295Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:17.6277633Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8k0zg112 2022-08-17T13:50:17.6278439Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8k0zg112/_remote_module_non_scriptable.py 2022-08-17T13:50:17.6439324Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:17.6440125Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:17.6461349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:50:17.9556125Z ok (3.210s) 2022-08-17T13:50:17.9556313Z 2022-08-17T13:50:17.9558564Z ---------------------------------------------------------------------- 2022-08-17T13:50:17.9558921Z Ran 38 tests in 125.643s 2022-08-17T13:50:17.9559093Z 2022-08-17T13:50:17.9559193Z OK 2022-08-17T13:50:17.9559329Z 2022-08-17T13:50:17.9559460Z Generating XML reports... 
2022-08-17T13:50:17.9614448Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_pypg/TEST-TestDDPWithWorkSubclass-20220817134812.xml 2022-08-17T13:50:17.9636761Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_pypg/TEST-TestDDPWithWorkWrapper-20220817134812.xml 2022-08-17T13:50:18.3151939Z Running distributed/fsdp/test_wrap ... [2022-08-17 13:50:18.314736] 2022-08-17T13:50:18.3152671Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_wrap.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:50:18.314809] 2022-08-17T13:50:19.9616578Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_wrap 2022-08-17T13:50:19.9640191Z 2022-08-17T13:50:19.9640419Z Running tests... 2022-08-17T13:50:19.9640992Z ---------------------------------------------------------------------- 2022-08-17T13:50:19.9648179Z test_always_wrap (__main__.TestAutoWrap) 2022-08-17T13:50:21.4989117Z Test to ensure that if `always_wrap_policy` is ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:50:21.5205865Z ok (1.556s) 2022-08-17T13:50:21.5238941Z test_always_wrap_with_ignored_modules_wrap_method_WrapMethod_FSDP_CTOR (__main__.TestAutoWrap) ... /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:50:21.5239857Z warnings.warn( 2022-08-17T13:50:21.5258610Z ok (0.005s) 2022-08-17T13:50:21.5282499Z test_always_wrap_with_ignored_modules_wrap_method_WrapMethod_WRAP_API (__main__.TestAutoWrap) ... [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.5283920Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.5285482Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.5304763Z ok (0.004s) 2022-08-17T13:50:21.5311005Z test_auto_wrap_api (__main__.TestAutoWrap) 2022-08-17T13:50:21.5334076Z Test to ensure with auto wrap, we wrap child modules correctly based on the min_num_params. ... ok (0.003s) 2022-08-17T13:50:21.5342001Z test_auto_wrap_preset_exclude_wrap (__main__.TestAutoWrap) 2022-08-17T13:50:21.5355920Z Test to ensure excluded modules are not wrapped, regardless if the total param size is greater than the ... ok (0.002s) 2022-08-17T13:50:21.5362548Z test_auto_wrap_preset_exclude_wrap_include_children (__main__.TestAutoWrap) 2022-08-17T13:50:21.5372905Z Test to ensure excluded modules are not wrapped, but children are if param size is greater than ... [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:50:21.5374296Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.5375581Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.5379492Z ok (0.002s) 2022-08-17T13:50:21.5388023Z test_auto_wrap_preset_force_leaf (__main__.TestAutoWrap) 2022-08-17T13:50:21.5414389Z Test to ensure force-leaf modules are not wrapped, and children are not wrapped. The ... ok (0.003s) 2022-08-17T13:50:21.5422896Z test_auto_wrap_preset_force_leaf_custom (__main__.TestAutoWrap) 2022-08-17T13:50:21.5437306Z Test to ensure force-leaf modules are not wrapped. ... ok (0.002s) 2022-08-17T13:50:21.5467935Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_AFTER_cpu_offload_CPUOffload(offload_params=False)_use_device_id_False (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:21.5468832Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 
2022-08-17T13:50:21.5506176Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:21.5506744Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:21.5510211Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.5511499Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.5512921Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9615915Z ok (0.418s) 2022-08-17T13:50:21.9644552Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_AFTER_cpu_offload_CPUOffload(offload_params=False)_use_device_id_True (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:21.9645546Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 
2022-08-17T13:50:21.9701919Z ok (0.009s) 2022-08-17T13:50:21.9722827Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_AFTER_cpu_offload_CPUOffload(offload_params=True)_use_device_id_False (__main__.TestAutoWrap) ... ok (0.002s) 2022-08-17T13:50:21.9743210Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_AFTER_cpu_offload_CPUOffload(offload_params=True)_use_device_id_True (__main__.TestAutoWrap) ... ok (0.002s) 2022-08-17T13:50:21.9769391Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_BEFORE_cpu_offload_CPUOffload(offload_params=False)_use_device_id_False (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:21.9770302Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:50:21.9823056Z ok (0.008s) 2022-08-17T13:50:21.9849339Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_BEFORE_cpu_offload_CPUOffload(offload_params=False)_use_device_id_True (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:21.9850237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:50:21.9873596Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9875141Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9876607Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9877913Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9879408Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9880815Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9882098Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9883349Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9884709Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9885960Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9887200Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9888458Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9889707Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:50:21.9890952Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:21.9923310Z ok (0.010s) 2022-08-17T13:50:21.9949387Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_BEFORE_cpu_offload_CPUOffload(offload_params=True)_use_device_id_False (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:21.9950270Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:50:22.0036545Z ok (0.011s) 2022-08-17T13:50:22.0062043Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_BEFORE_cpu_offload_CPUOffload(offload_params=True)_use_device_id_True (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:22.0062907Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-08-17T13:50:22.0137360Z ok (0.010s) 2022-08-17T13:50:22.0162943Z test_auto_wrap_with_ignored_modules_wrap_method_WrapMethod_FSDP_CTOR (__main__.TestAutoWrap) ... [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:22.0164365Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:22.0165653Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:22.0167010Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:22.0173922Z ok (0.004s) 2022-08-17T13:50:22.0204415Z test_auto_wrap_with_ignored_modules_wrap_method_WrapMethod_WRAP_API (__main__.TestAutoWrap) ... ok (0.003s) 2022-08-17T13:50:22.0213425Z test_transformer_auto_wrap_policy (__main__.TestAutoWrap) 2022-08-17T13:50:22.0241874Z Tests the ``transformer_auto_wrap_policy``. ... [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:22.0401078Z ok (0.020s) 2022-08-17T13:50:22.0420621Z test_wrap_disabled_outside_context (__main__.TestAutoWrap) ... ok (0.002s) 2022-08-17T13:50:22.0442125Z test_wrap_override_defaults (__main__.TestAutoWrap) ... [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:22.0443508Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:50:22.0445017Z ok (0.002s) 2022-08-17T13:50:22.0466098Z test_wrap_wrap_method_WrapMethod_FSDP_CTOR (__main__.TestAutoWrap) ... ok (0.002s) 2022-08-17T13:50:22.0486504Z test_wrap_wrap_method_WrapMethod_WRAP_API (__main__.TestAutoWrap) ... ok (0.002s) 2022-08-17T13:50:22.0499276Z test_bn_always_wrapped_individually (__main__.TestFSDPWrap) 2022-08-17T13:50:22.0528889Z Ensures that by using _or_policy with _wrap_batchnorm_individually, even ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93549 2022-08-17T13:50:22.0535317Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93550 2022-08-17T13:50:23.5034651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:23.5035141Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:23.5037814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:23.5038313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:23.5405233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:23.5405702Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:23.5410075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:23.5410552Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:23.6723770Z dist init r=0, world=2 2022-08-17T13:50:23.6727153Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:23.7146421Z dist init r=1, world=2 2022-08-17T13:50:23.7151694Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:50:23.7152736Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:23.7237067Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:50:25.1116156Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:50:25.1116692Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:25.1378880Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:50:25.1379783Z warnings.warn( 2022-08-17T13:50:25.1380923Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:50:25.1381655Z warnings.warn( 2022-08-17T13:50:25.5626561Z ok (3.514s) 2022-08-17T13:50:25.5631764Z test_error_already_wrapped_nested_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) 2022-08-17T13:50:25.5645356Z Test that an error is raised if we attempt to wrap when submodules are ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93628 2022-08-17T13:50:25.5651313Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93629 2022-08-17T13:50:27.0279084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:27.0279587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:27.0282060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:27.0282526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:27.0709684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:27.0710155Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:27.0714732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:27.0715198Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:27.1971928Z dist init r=0, world=2 2022-08-17T13:50:27.1975416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:27.2464358Z dist init r=1, world=2 2022-08-17T13:50:27.2469857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:50:27.2470691Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:27.2485473Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:50:28.6233666Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:50:28.6234246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:28.6446156Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:50:28.6447313Z warnings.warn( 2022-08-17T13:50:28.6448487Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:50:28.6449295Z warnings.warn( 2022-08-17T13:50:29.0740782Z ok (3.511s) 2022-08-17T13:50:29.0746546Z test_error_already_wrapped_nested_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) 2022-08-17T13:50:29.0759361Z Test that an error is raised if we attempt to wrap when submodules are ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93707 2022-08-17T13:50:29.0765191Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93708 2022-08-17T13:50:30.4886759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:30.4887269Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:30.4889889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:30.4890408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:30.5223141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:30.5223950Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:30.5228743Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:30.5229241Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:30.6542838Z dist init r=1, world=2 2022-08-17T13:50:30.6546743Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:50:30.6960668Z dist init r=0, world=2 2022-08-17T13:50:30.6965961Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:30.6967110Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:30.7056874Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:50:32.1033059Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:50:32.1033945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:32.5852238Z ok (3.511s) 2022-08-17T13:50:32.5858672Z test_error_already_wrapped_nested_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) 2022-08-17T13:50:32.5872557Z Test that an error is raised if we attempt to wrap when submodules are ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93786 2022-08-17T13:50:32.5879066Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93787 2022-08-17T13:50:34.0271047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:34.0271549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:34.0273899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:34.0274370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:34.0382310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:34.0382784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:34.0387935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:34.0388417Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:34.1945241Z dist init r=0, world=2 2022-08-17T13:50:34.1948748Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:34.2122032Z dist init r=1, world=2 2022-08-17T13:50:34.2127214Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:50:34.2128348Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:34.2153470Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:35.5950351Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:35.5950869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:50:35.6165562Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:50:35.6166340Z warnings.warn( 2022-08-17T13:50:35.6167452Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:50:35.6168206Z warnings.warn( 2022-08-17T13:50:36.0967214Z ok (3.511s) 2022-08-17T13:50:36.0972487Z test_error_already_wrapped_nested_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) 2022-08-17T13:50:36.0985766Z Test that an error is raised if we attempt to wrap when submodules are ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93865 2022-08-17T13:50:36.0991989Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93866 2022-08-17T13:50:37.5299728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:37.5300254Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:37.5303272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:37.5303974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:37.5597533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:37.5598088Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:37.5598720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:37.5599192Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:37.6966672Z dist init r=1, world=2 2022-08-17T13:50:37.6970340Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:50:37.7356584Z dist init r=0, world=2 2022-08-17T13:50:37.7361952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:37.7363018Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:37.7378442Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:50:39.1249174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:50:39.1249680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:39.6080734Z ok (3.511s) 2022-08-17T13:50:39.6117413Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 93944 2022-08-17T13:50:39.6123377Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 93945 2022-08-17T13:50:41.0210554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:41.0211055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:41.0214637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:41.0215133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:41.0603127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:41.0603576Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:41.0607501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:41.0608002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:41.1962316Z dist init r=1, world=2 2022-08-17T13:50:41.1966771Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:50:41.2284666Z dist init r=0, world=2 2022-08-17T13:50:41.2289442Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:41.2290267Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:41.2375733Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:42.6181359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:42.6181894Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:50:42.6413367Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:50:42.6414182Z warnings.warn( 2022-08-17T13:50:42.6415305Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:50:42.6416058Z warnings.warn( 2022-08-17T13:50:42.6475814Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:42.6476372Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:42.6480157Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:42.6480715Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:43.5223447Z ok (3.914s) 2022-08-17T13:50:43.5260497Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94027 2022-08-17T13:50:43.5266651Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94028 2022-08-17T13:50:44.9742016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:44.9742561Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:44.9744564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:44.9745056Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:45.0499556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:45.0500047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:45.0503361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T13:50:45.0504454Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:45.1411707Z dist init r=1, world=2 2022-08-17T13:50:45.1415823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:50:45.2232240Z dist init r=0, world=2 2022-08-17T13:50:45.2237266Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:45.2238755Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:45.2332977Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:46.5994826Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:50:46.5995337Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:46.6273057Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:46.6273648Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:46.6274641Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:46.6275188Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:47.5366706Z ok (4.014s) 2022-08-17T13:50:47.5403969Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94110 2022-08-17T13:50:47.5409862Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94111 2022-08-17T13:50:48.9802496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:48.9803444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:48.9805475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:48.9805978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:48.9820893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:48.9821357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:48.9826777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:48.9827241Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:49.1493832Z dist init r=1, world=2 2022-08-17T13:50:49.1497411Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:50:49.1525373Z dist init r=0, world=2 2022-08-17T13:50:49.1529916Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:49.1530855Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:49.1601110Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:50:50.5235675Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:50.5236209Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:50:50.5450780Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:50:50.5451601Z warnings.warn( 2022-08-17T13:50:50.5452689Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:50:50.5453449Z warnings.warn( 2022-08-17T13:50:50.5513054Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:50.5513622Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:50.5514313Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:50.5515166Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:51.4519480Z ok (3.915s) 2022-08-17T13:50:51.4556981Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94193 2022-08-17T13:50:51.4562600Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94194 2022-08-17T13:50:52.9025577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:52.9026507Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:52.9028574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:52.9029497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:52.9534914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:52.9535917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:52.9539298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T13:50:52.9540283Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:53.0703972Z dist init r=1, world=2 2022-08-17T13:50:53.0708923Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:50:53.1201611Z dist init r=0, world=2 2022-08-17T13:50:53.1206371Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:53.1207157Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:53.1218983Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:54.4814363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:50:54.4815375Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:54.5068601Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:54.5069512Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:54.5070574Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:54.5071528Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:55.4661371Z ok (4.014s) 2022-08-17T13:50:55.4698541Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94276 2022-08-17T13:50:55.4704649Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94277 2022-08-17T13:50:56.9127249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:56.9127749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:56.9130555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:56.9131057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:56.9291393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:50:56.9291908Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:50:56.9296240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:50:56.9296720Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:50:57.0807830Z dist init r=0, world=2 2022-08-17T13:50:57.0811169Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:50:57.1369034Z dist init r=1, world=2 2022-08-17T13:50:57.1373918Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:50:57.1374951Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:50:57.1422882Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:50:58.5008006Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:50:58.5008527Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:50:58.5212754Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:50:58.5213583Z warnings.warn( 2022-08-17T13:50:58.5214700Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:50:58.5215440Z warnings.warn( 2022-08-17T13:50:58.5274985Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:58.5275547Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:58.5276220Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:50:58.5276770Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:50:59.4803240Z ok (4.014s) 2022-08-17T13:50:59.4840340Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94359 2022-08-17T13:50:59.4846325Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94360 2022-08-17T13:51:00.9131891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:00.9132413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:00.9134547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:00.9135029Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:00.9774209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:00.9774896Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:00.9778956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T13:51:00.9779433Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:01.0816822Z dist init r=0, world=2 2022-08-17T13:51:01.0820824Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:01.1481855Z dist init r=1, world=2 2022-08-17T13:51:01.1486250Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:01.1487208Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:01.1534226Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:02.5143000Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:02.5143521Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:02.5473036Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:02.5473622Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:02.5474330Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:02.5474870Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:03.4948564Z ok (4.014s) 2022-08-17T13:51:03.4985580Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94442 2022-08-17T13:51:03.4991499Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94443 2022-08-17T13:51:04.9678928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:04.9679436Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:04.9682218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:04.9682701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:05.0453545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:05.0454029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:05.0458422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:05.0458911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:05.1339109Z dist init r=0, world=2 2022-08-17T13:51:05.1343072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:05.2174010Z dist init r=1, world=2 2022-08-17T13:51:05.2178462Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:05.2179338Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:05.2259410Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:51:06.5755646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:06.5756465Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:06.5973557Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:51:06.5974421Z warnings.warn( 2022-08-17T13:51:06.5975530Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T13:51:06.5976444Z warnings.warn( 2022-08-17T13:51:06.6035208Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:06.6035794Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:06.6038023Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:06.6038581Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:07.5102509Z ok (4.015s) 2022-08-17T13:51:07.5139818Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94525 2022-08-17T13:51:07.5145844Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94526 2022-08-17T13:51:08.9778549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:08.9779043Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:08.9781447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:08.9781939Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:09.0003802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:09.0004267Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:09.0009035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T13:51:09.0009509Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:09.1489691Z dist init r=0, world=2 2022-08-17T13:51:09.1492860Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:09.2078485Z dist init r=1, world=2 2022-08-17T13:51:09.2083300Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:09.2084092Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:09.2104269Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:10.5724444Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:10.5725425Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:10.6029286Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:10.6030346Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:10.6031575Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:10.6032492Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:11.5249392Z ok (4.015s) 2022-08-17T13:51:11.5286532Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94608 2022-08-17T13:51:11.5292395Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94609 2022-08-17T13:51:12.9633378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:12.9633894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:12.9636557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:12.9637050Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:12.9660192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:12.9660656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:12.9665189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:12.9665685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:13.1308208Z dist init r=1, world=2 2022-08-17T13:51:13.1312249Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:13.1342456Z dist init r=0, world=2 2022-08-17T13:51:13.1347502Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:13.1348779Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:13.1416342Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:51:14.4926246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:14.4926775Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:14.9377291Z ok (3.413s) 2022-08-17T13:51:14.9414533Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94687 2022-08-17T13:51:14.9420126Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94688 2022-08-17T13:51:16.3776037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:16.3776545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:16.3779224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:16.3779725Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:16.4042008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:16.4042476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:16.4046866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:16.4047378Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:16.5447190Z dist init r=0, world=2 2022-08-17T13:51:16.5451162Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:16.5689705Z dist init r=1, world=2 2022-08-17T13:51:16.5694013Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:16.5695107Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:16.5757998Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:17.9362965Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:17.9363512Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:17.9636232Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:17.9636818Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:17.9638001Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:17.9638557Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:18.8517858Z ok (3.914s) 2022-08-17T13:51:18.8554870Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94770 2022-08-17T13:51:18.8560627Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94771 2022-08-17T13:51:20.3127705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:20.3128227Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:20.3130883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:20.3131383Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:20.3338722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:20.3339216Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:20.3343693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:20.3344190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:20.4802191Z dist init r=1, world=2 2022-08-17T13:51:20.4806089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:20.5074840Z dist init r=0, world=2 2022-08-17T13:51:20.5079698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:20.5080812Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:20.5112783Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:51:21.8797523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:21.8798052Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:22.2649116Z ok (3.413s) 2022-08-17T13:51:22.2687925Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94849 2022-08-17T13:51:22.2693661Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94850 2022-08-17T13:51:23.6975549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:23.6976064Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:23.6978732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:23.6979505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:23.7310147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:23.7310614Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:23.7315262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:23.7315748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:23.8637178Z dist init r=0, world=2 2022-08-17T13:51:23.8641573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:23.9021651Z dist init r=1, world=2 2022-08-17T13:51:23.9026303Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:23.9027471Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:23.9048713Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:25.2652466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:25.2653447Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:25.2958675Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:25.2959676Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:25.2960947Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:25.2961948Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:26.2794934Z ok (4.015s) 2022-08-17T13:51:26.2832747Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 94932 2022-08-17T13:51:26.2838693Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 94933 2022-08-17T13:51:27.7645790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:27.7646307Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:27.7648811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:27.7649282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:27.8040631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:27.8041322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:27.8045698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:27.8046167Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:27.9320193Z dist init r=0, world=2 2022-08-17T13:51:27.9324230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:27.9764869Z dist init r=1, world=2 2022-08-17T13:51:27.9769749Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:27.9770792Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:27.9834322Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:51:29.3428746Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:29.3429274Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:29.7925130Z ok (3.513s) 2022-08-17T13:51:29.7962792Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95011 2022-08-17T13:51:29.7968629Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95012 2022-08-17T13:51:31.2564353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:31.2564873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:31.2567536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:31.2568023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:31.2700815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:31.2701278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:31.2705820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:31.2706278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:31.4242471Z dist init r=1, world=2 2022-08-17T13:51:31.4246970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:31.4422408Z dist init r=0, world=2 2022-08-17T13:51:31.4427785Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:31.4428590Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:31.4451650Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:32.8240878Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:32.8241402Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:32.8518824Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:32.8519403Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:32.8551130Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:32.8551999Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:33.8070678Z ok (4.014s) 2022-08-17T13:51:33.8113500Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95094 2022-08-17T13:51:33.8119041Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95095 2022-08-17T13:51:35.2474580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:35.2475100Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:35.2477190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:35.2477932Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:35.2773485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:35.2773962Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:35.2777839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:35.2778327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:35.4144689Z dist init r=1, world=2 2022-08-17T13:51:35.4148940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:35.4492329Z dist init r=0, world=2 2022-08-17T13:51:35.4497623Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:35.4498376Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:35.4557589Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:51:36.8234494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:36.8235014Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:37.2205864Z ok (3.413s) 2022-08-17T13:51:37.2243381Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95173 2022-08-17T13:51:37.2249355Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95174 2022-08-17T13:51:38.7074742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:38.7075256Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:38.7077681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:38.7078175Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:38.7249896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:38.7250357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:38.7254739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:38.7255226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:38.8741804Z dist init r=1, world=2 2022-08-17T13:51:38.8746398Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:38.8982363Z dist init r=0, world=2 2022-08-17T13:51:38.8987463Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:38.8988241Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:38.9053127Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:40.2607233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:40.2607757Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:40.2879458Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:40.2880367Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:40.2881092Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:40.2881650Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:41.2347910Z ok (4.014s) 2022-08-17T13:51:41.2367826Z test_wrap_batchnorm_individually_use_or_policy_False (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95256 2022-08-17T13:51:41.2374080Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95257 2022-08-17T13:51:42.7010957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:42.7013653Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:42.7014326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:42.7014825Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:42.7311525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:42.7311981Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:42.7316092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:42.7316580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:42.8744438Z dist init r=0, world=2 2022-08-17T13:51:42.8748659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:42.8979046Z dist init r=1, world=2 2022-08-17T13:51:42.8983460Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:42.8984490Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:42.9055947Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:51:44.2824701Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:44.2825230Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:44.3074560Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:51:44.3075437Z warnings.warn( 2022-08-17T13:51:44.3076845Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:51:44.3077615Z warnings.warn( 2022-08-17T13:51:44.7461899Z ok (3.511s) 2022-08-17T13:51:44.7482417Z test_wrap_batchnorm_individually_use_or_policy_True (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95335 2022-08-17T13:51:44.7488340Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95336 2022-08-17T13:51:46.1599640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:46.1600374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:46.1602676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:46.1603161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:46.1929040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:46.1929492Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:46.1933865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:46.1934338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:46.3268059Z dist init r=1, world=2 2022-08-17T13:51:46.3272088Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:46.3658664Z dist init r=0, world=2 2022-08-17T13:51:46.3663467Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:46.3664518Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:46.3680407Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:51:47.7430800Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:47.7431331Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:47.7658703Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:51:47.7659528Z warnings.warn( 2022-08-17T13:51:47.7660653Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:51:47.7661399Z warnings.warn( 2022-08-17T13:51:48.1573864Z ok (3.411s) 2022-08-17T13:51:48.1574196Z 2022-08-17T13:51:48.1574648Z ---------------------------------------------------------------------- 2022-08-17T13:51:48.1575166Z Ran 46 tests in 88.193s 2022-08-17T13:51:48.1575374Z 2022-08-17T13:51:48.1575470Z OK 2022-08-17T13:51:48.1575605Z 2022-08-17T13:51:48.1575741Z Generating XML reports... 2022-08-17T13:51:48.1582896Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function concrete_decref_fn) 2022-08-17T13:51:48.1584637Z [W python_variable.cpp:232] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function concrete_decref_fn) 2022-08-17T13:51:48.1637219Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_wrap/TEST-TestAutoWrap-20220817135019.xml 2022-08-17T13:51:48.1662957Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_wrap/TEST-TestFSDPWrap-20220817135019.xml 2022-08-17T13:51:48.5373060Z Running distributed/fsdp/test_fsdp_clip_grad_norm ... [2022-08-17 13:51:48.536810] 2022-08-17T13:51:48.5373836Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_clip_grad_norm.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:51:48.536882] 2022-08-17T13:51:50.1107721Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_clip_grad_norm 2022-08-17T13:51:50.1125441Z 2022-08-17T13:51:50.1125852Z Running tests... 2022-08-17T13:51:50.1126326Z ---------------------------------------------------------------------- 2022-08-17T13:51:50.1134752Z test_fsdp_calc_grad_norm_norm_type_1_3_nested_fsdp_False (__main__.TestCalcuGradNorm) 2022-08-17T13:51:51.6173975Z Test grad norm cal API. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:51:51.6352900Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95449 2022-08-17T13:51:51.6359298Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95450 2022-08-17T13:51:53.0469773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:53.0470294Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:53.0472341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:53.0472850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:53.0709751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:53.0710299Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:53.0714596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:53.0715111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:53.2147093Z dist init r=1, world=2 2022-08-17T13:51:53.2151066Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:53.2423125Z dist init r=0, world=2 2022-08-17T13:51:53.2427859Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:53.2428772Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:53.2457302Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:51:54.6341381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:54.6341906Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:54.6580723Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:54.6581933Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:54.6582666Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:54.6583188Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:55.5457284Z ok (5.433s) 2022-08-17T13:51:55.5463030Z test_fsdp_calc_grad_norm_norm_type_1_3_nested_fsdp_True (__main__.TestCalcuGradNorm) 2022-08-17T13:51:55.5476712Z Test grad norm cal API. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95532 2022-08-17T13:51:55.5482411Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95533 2022-08-17T13:51:56.9774327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:56.9774816Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:56.9777675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:56.9778153Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:56.9891324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:51:56.9891785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:51:56.9895583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:51:56.9896044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:51:57.1453685Z dist init r=0, world=2 2022-08-17T13:51:57.1457873Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:51:57.1566421Z dist init r=1, world=2 2022-08-17T13:51:57.1570528Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:51:57.1571292Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:51:57.1663129Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:51:58.5230887Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:51:58.5231393Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:51:58.5471256Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:58.5471847Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:58.5502885Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:51:58.5503427Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:51:59.4578627Z ok (3.912s) 2022-08-17T13:51:59.4584615Z test_fsdp_calc_grad_norm_norm_type_2_0_nested_fsdp_False (__main__.TestCalcuGradNorm) 2022-08-17T13:51:59.4597907Z Test grad norm cal API. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95615 2022-08-17T13:51:59.4604121Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95616 2022-08-17T13:52:00.8923596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:00.8924160Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:00.8926633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:00.8927365Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:00.9084513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:00.9084990Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:00.9089155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:00.9089635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:01.0595228Z dist init r=1, world=2 2022-08-17T13:52:01.0599137Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:01.0733912Z dist init r=0, world=2 2022-08-17T13:52:01.0738334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:01.0739217Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:01.0804608Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:52:02.4357717Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:02.4358252Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:02.4582376Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:02.4582963Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:02.4583943Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:02.4584511Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:03.3703414Z ok (3.912s) 2022-08-17T13:52:03.3709695Z test_fsdp_calc_grad_norm_norm_type_2_0_nested_fsdp_True (__main__.TestCalcuGradNorm) 2022-08-17T13:52:03.3723056Z Test grad norm cal API. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95698 2022-08-17T13:52:03.3729314Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95699 2022-08-17T13:52:04.8548075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:04.8548645Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:04.8550843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:04.8551351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:04.9029876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:04.9030359Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:04.9034393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:04.9034875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:05.0216370Z dist init r=1, world=2 2022-08-17T13:52:05.0220155Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:05.0738686Z dist init r=0, world=2 2022-08-17T13:52:05.0743322Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:05.0744350Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:05.0832225Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:52:06.4671496Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:06.4672048Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:06.4908407Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:06.4908971Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:06.4909665Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:06.4910213Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:07.3828542Z ok (4.012s) 2022-08-17T13:52:07.3834470Z test_fsdp_calc_grad_norm_norm_type_2_5_nested_fsdp_False (__main__.TestCalcuGradNorm) 2022-08-17T13:52:07.3847948Z Test grad norm cal API. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95781 2022-08-17T13:52:07.3854283Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95782 2022-08-17T13:52:08.8136075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:08.8137001Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:08.8139568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:08.8140496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:08.8497680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:08.8498638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:08.8502058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:08.8503007Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:08.9806430Z dist init r=1, world=2 2022-08-17T13:52:08.9811586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:09.0176772Z dist init r=0, world=2 2022-08-17T13:52:09.0181140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:09.0181895Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:09.0220070Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:52:10.3892503Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:10.3893034Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:10.4100951Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:10.4101520Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:10.4102214Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:10.4102734Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:11.2950147Z ok (3.912s) 2022-08-17T13:52:11.2955957Z test_fsdp_calc_grad_norm_norm_type_2_5_nested_fsdp_True (__main__.TestCalcuGradNorm) 2022-08-17T13:52:11.2968649Z Test grad norm cal API. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95864 2022-08-17T13:52:11.2974966Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95865 2022-08-17T13:52:12.7765161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:12.7765660Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:12.7767803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:12.7768275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:12.8030085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:12.8030549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:12.8034958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:12.8035422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:12.9426126Z dist init r=0, world=2 2022-08-17T13:52:12.9430206Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:12.9790194Z dist init r=1, world=2 2022-08-17T13:52:12.9795072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:12.9796204Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:12.9838567Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:52:14.3409164Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:14.3409765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:14.3630345Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:14.3630907Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:14.3631613Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:14.3632139Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:15.2073173Z ok (3.912s) 2022-08-17T13:52:15.2078841Z test_fsdp_calc_grad_norm_norm_type_inf_nested_fsdp_False (__main__.TestCalcuGradNorm) 2022-08-17T13:52:15.2092431Z Test grad norm cal API. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 95947 2022-08-17T13:52:15.2100524Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 95948 2022-08-17T13:52:16.6826325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:16.6826839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:16.6829832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:16.6830541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:16.7216908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:16.7217646Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:16.7221239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:16.7221978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:16.8483331Z dist init r=1, world=2 2022-08-17T13:52:16.8487330Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:16.8932290Z dist init r=0, world=2 2022-08-17T13:52:16.8936554Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:16.8938006Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:16.8997938Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:52:18.2682876Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:18.2683777Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:18.2901617Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:18.2902699Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:18.2903690Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:18.2904248Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:18.7018685Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:4222: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:18.7058661Z local_norm = torch.tensor(max(par.grad.detach().abs().max() for par in parameters)) 2022-08-17T13:52:18.7060184Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:4222: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:18.7060908Z local_norm = torch.tensor(max(par.grad.detach().abs().max() for par in parameters)) 2022-08-17T13:52:19.1199040Z ok (3.912s) 2022-08-17T13:52:19.1206554Z test_fsdp_calc_grad_norm_norm_type_inf_nested_fsdp_True (__main__.TestCalcuGradNorm) 2022-08-17T13:52:19.1220411Z Test grad norm cal API. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96030 2022-08-17T13:52:19.1226778Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96031 2022-08-17T13:52:20.6058682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:20.6059434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:20.6061673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:20.6062172Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:20.6280555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:20.6281002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:20.6285297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:20.6285775Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:20.7783436Z dist init r=1, world=2 2022-08-17T13:52:20.7787573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:20.7953910Z dist init r=0, world=2 2022-08-17T13:52:20.7958163Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:20.7959241Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:20.7992832Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:52:22.1722418Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:22.1722929Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:22.1948419Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:22.1948983Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:22.1949681Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:22.1950528Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:22.6283234Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:4222: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:22.6284021Z local_norm = torch.tensor(max(par.grad.detach().abs().max() for par in parameters)) 2022-08-17T13:52:22.6287717Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:4222: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:22.6288428Z local_norm = torch.tensor(max(par.grad.detach().abs().max() for par in parameters)) 2022-08-17T13:52:23.0325591Z ok (3.913s) 2022-08-17T13:52:23.0331836Z test_fsdp_clip_grad_norm_norm_type_2_0_nested_fsdp_False_cpu_offload_CPUOffload(offload_params=False) (__main__.TestClipGradNorm) 2022-08-17T13:52:23.0346704Z Test FSDP with clip grad norm. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96113 2022-08-17T13:52:23.0352674Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96114 2022-08-17T13:52:24.5143160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:24.5143917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:24.5145832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:24.5146316Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:24.5376513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:24.5376975Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:24.5380896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:24.5381379Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:24.6802838Z dist init r=0, world=2 2022-08-17T13:52:24.6806802Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:24.7090764Z dist init r=1, world=2 2022-08-17T13:52:24.7095319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:24.7096025Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:24.7113125Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:52:26.0857754Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:26.0858635Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:26.5163556Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:26.5164262Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:26.5169953Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:26.5170519Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:26.5204164Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:26.5204826Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:26.5212066Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:26.5212628Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:26.5252131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 
2022-08-17T13:52:26.5252787Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:26.5253679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:26.5254320Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:26.9449297Z ok (3.912s) 2022-08-17T13:52:26.9454521Z test_fsdp_clip_grad_norm_norm_type_2_0_nested_fsdp_False_cpu_offload_CPUOffload(offload_params=True) (__main__.TestClipGradNorm) 2022-08-17T13:52:26.9468177Z Test FSDP with clip grad norm. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96196 2022-08-17T13:52:26.9474410Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96197 2022-08-17T13:52:28.4323473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:28.4324000Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:28.4326939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:28.4327408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:28.4385099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:28.4385558Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:28.4390025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:28.4390533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:28.5984948Z dist init r=0, world=2 
2022-08-17T13:52:28.5988805Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:28.6034931Z dist init r=1, world=2 2022-08-17T13:52:28.6040016Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:28.6040810Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:28.6092457Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:29.9716136Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:29.9716674Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:30.3982134Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:30.3983209Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:30.3988418Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 
2022-08-17T13:52:30.3989076Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:30.3991407Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:30.3991974Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:30.3997690Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:30.3998261Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:30.4048939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:30.4049638Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:30.4050540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:30.4051177Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:30.8571776Z ok (3.912s) 2022-08-17T13:52:30.8576502Z test_fsdp_clip_grad_norm_norm_type_2_0_nested_fsdp_True_cpu_offload_CPUOffload(offload_params=False) (__main__.TestClipGradNorm) 2022-08-17T13:52:30.8590267Z Test FSDP with clip grad norm. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96279 2022-08-17T13:52:30.8596595Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96280 2022-08-17T13:52:32.3243605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:32.3244084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:32.3247300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:32.3247786Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:32.3790960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:32.3791430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:32.3795533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:32.3796218Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:32.4900930Z dist init r=0, world=2 2022-08-17T13:52:32.4905274Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:32.5533176Z dist init r=1, world=2 2022-08-17T13:52:32.5537783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:32.5538920Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:32.5618695Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:52:33.9237618Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:33.9238123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:34.3721866Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:34.3722592Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:34.3731527Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:34.3732073Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:34.3806649Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:34.3807312Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:34.3818107Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:34.3818663Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:34.3875789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 
2022-08-17T13:52:34.3876463Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:34.3877369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:34.3878012Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:34.8695790Z ok (4.012s) 2022-08-17T13:52:34.8700102Z test_fsdp_clip_grad_norm_norm_type_2_0_nested_fsdp_True_cpu_offload_CPUOffload(offload_params=True) (__main__.TestClipGradNorm) 2022-08-17T13:52:34.8714035Z Test FSDP with clip grad norm. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96362 2022-08-17T13:52:34.8720285Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96363 2022-08-17T13:52:36.3266329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:36.3266912Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:36.3269406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:36.3270127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:36.3328772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:36.3329236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:36.3333324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:36.3333807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:36.4940407Z dist init r=0, world=2 
2022-08-17T13:52:36.4944316Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:36.5076198Z dist init r=1, world=2 2022-08-17T13:52:36.5080991Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:36.5081763Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:36.5149661Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:37.8758810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:37.8759317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:38.3062770Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:38.3063815Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:38.3076221Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:38.3076792Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:38.3100872Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 
2022-08-17T13:52:38.3101522Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:38.3114838Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:38.3115408Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:38.3186653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:38.3187337Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:38.3188246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:38.3188873Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:38.7822175Z ok (3.913s) 2022-08-17T13:52:38.7827179Z test_fsdp_clip_grad_norm_norm_type_inf_nested_fsdp_False_cpu_offload_CPUOffload(offload_params=False) (__main__.TestClipGradNorm) 2022-08-17T13:52:38.7840304Z Test FSDP with clip grad norm. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96445 2022-08-17T13:52:38.7847066Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96446 2022-08-17T13:52:40.2046169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:40.2046675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:40.2048934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:40.2049408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:40.2831993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:40.2832465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:40.2835725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:40.2836195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:40.3702860Z dist init r=1, world=2 2022-08-17T13:52:40.3707356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:40.4556774Z dist init r=0, world=2 2022-08-17T13:52:40.4561222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:40.4561945Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:40.4623985Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:52:41.8498098Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:41.8498651Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:42.2740529Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:42.2741238Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:42.2747716Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:42.2748268Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:42.2862970Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:42.2863825Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:42.2871057Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:42.2871626Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:42.2910983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 
2022-08-17T13:52:42.2911903Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:42.2912808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:42.2913981Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:42.2941143Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:4222: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:42.2941862Z local_norm = torch.tensor(max(par.grad.detach().abs().max() for par in parameters)) 2022-08-17T13:52:42.2943077Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:4222: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:42.2944185Z local_norm = torch.tensor(max(par.grad.detach().abs().max() for par in parameters)) 2022-08-17T13:52:42.6944673Z ok (3.912s) 2022-08-17T13:52:42.6949834Z test_fsdp_clip_grad_norm_norm_type_inf_nested_fsdp_False_cpu_offload_CPUOffload(offload_params=True) (__main__.TestClipGradNorm) 2022-08-17T13:52:42.6963047Z Test FSDP with clip grad norm. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96528 2022-08-17T13:52:42.6969148Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96529 2022-08-17T13:52:44.1425412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:44.1425912Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:44.1429019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:44.1429516Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:44.1678187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:44.1678663Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:44.1682899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:44.1683374Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:44.3080686Z dist init r=1, world=2 2022-08-17T13:52:44.3084988Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:44.3391908Z dist init r=0, world=2 2022-08-17T13:52:44.3396228Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:44.3397202Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:44.3492409Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:52:45.7134672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:45.7135212Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:46.1574271Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:46.1574983Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:46.1582527Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:46.1583118Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:46.1662797Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:46.1663734Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:46.1672275Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:46.1672836Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:46.1719179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 
2022-08-17T13:52:46.1720018Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:46.1720915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:46.1721559Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:46.1747847Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:4222: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:46.1748823Z local_norm = torch.tensor(max(par.grad.detach().abs().max() for par in parameters)) 2022-08-17T13:52:46.1749809Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:4222: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:46.1750491Z local_norm = torch.tensor(max(par.grad.detach().abs().max() for par in parameters)) 2022-08-17T13:52:46.6068000Z ok (3.912s) 2022-08-17T13:52:46.6072170Z test_fsdp_clip_grad_norm_norm_type_inf_nested_fsdp_True_cpu_offload_CPUOffload(offload_params=False) (__main__.TestClipGradNorm) 2022-08-17T13:52:46.6085592Z Test FSDP with clip grad norm. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96611 2022-08-17T13:52:46.6092084Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96612 2022-08-17T13:52:48.0447693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:48.0448215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:48.0450342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:48.0450834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:48.0617052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:48.0617541Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:48.0621383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:48.0621868Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:48.2118490Z dist init r=1, world=2 2022-08-17T13:52:48.2122143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:48.2333629Z dist init r=0, world=2 2022-08-17T13:52:48.2338091Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:48.2339147Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:48.2429215Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:52:49.6084454Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:49.6084978Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:50.0285145Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:50.0286235Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:50.0295252Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:50.0295819Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:50.0426478Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:50.0427132Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:50.0438138Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:50.0438693Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:50.0497152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 
2022-08-17T13:52:50.0497818Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:50.0498713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:50.0499343Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:50.0529385Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:4222: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:50.0530159Z local_norm = torch.tensor(max(par.grad.detach().abs().max() for par in parameters)) 2022-08-17T13:52:50.0531139Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:4222: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:50.0531842Z local_norm = torch.tensor(max(par.grad.detach().abs().max() for par in parameters)) 2022-08-17T13:52:50.5191134Z ok (3.912s) 2022-08-17T13:52:50.5195589Z test_fsdp_clip_grad_norm_norm_type_inf_nested_fsdp_True_cpu_offload_CPUOffload(offload_params=True) (__main__.TestClipGradNorm) 2022-08-17T13:52:50.5208701Z Test FSDP with clip grad norm. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96694 2022-08-17T13:52:50.5214873Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96695 2022-08-17T13:52:51.9477930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:51.9478462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:51.9481178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:51.9481644Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:51.9808063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:52:51.9808530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:52:51.9812812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:52:51.9813452Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:52:52.1137823Z dist init r=1, world=2 2022-08-17T13:52:52.1141914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:52:52.1537866Z dist init r=0, world=2 2022-08-17T13:52:52.1542434Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:52:52.1543412Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:52:52.1550575Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:52:53.5220623Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:52:53.5221169Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:52:53.9622976Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:53.9624102Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:53.9635620Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:53.9636186Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:53.9810163Z /var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_clip_grad_norm.py:51: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:53.9810846Z in_data = torch.tensor(input[self.rank], device=self.rank) 2022-08-17T13:52:53.9823738Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:52:53.9824300Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:52:53.9892628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 
2022-08-17T13:52:53.9893347Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:53.9894420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1069: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:53.9895080Z return_norm = torch.tensor(total_norm ** norm_type, device=rank) 2022-08-17T13:52:53.9923730Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:4222: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:53.9924488Z local_norm = torch.tensor(max(par.grad.detach().abs().max() for par in parameters)) 2022-08-17T13:52:53.9925446Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:4222: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). 2022-08-17T13:52:53.9926394Z local_norm = torch.tensor(max(par.grad.detach().abs().max() for par in parameters)) 2022-08-17T13:52:54.4311766Z ok (3.912s) 2022-08-17T13:52:54.4311940Z 2022-08-17T13:52:54.4312453Z ---------------------------------------------------------------------- 2022-08-17T13:52:54.4312921Z Ran 16 tests in 64.319s 2022-08-17T13:52:54.4313103Z 2022-08-17T13:52:54.4313200Z OK 2022-08-17T13:52:54.4313321Z 2022-08-17T13:52:54.4317103Z Generating XML reports... 
2022-08-17T13:52:54.4357932Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_clip_grad_norm/TEST-TestCalcuGradNorm-20220817135150.xml 2022-08-17T13:52:54.4368000Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_clip_grad_norm/TEST-TestClipGradNorm-20220817135150.xml 2022-08-17T13:52:54.7852146Z Running distributed/algorithms/quantization/test_quantization ... [2022-08-17 13:52:54.784764] 2022-08-17T13:52:54.7859617Z /usr/bin/mpiexec 2022-08-17T13:52:54.7860155Z MPI not available -- MPI backend tests will be skipped 2022-08-17T13:52:54.7868367Z Running distributed tests for the test backend with env init_method 2022-08-17T13:52:54.7869185Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/algorithms/quantization/test_quantization.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:52:54.786613] 2022-08-17T13:52:56.3814701Z 2022-08-17T13:52:56.6022968Z Running distributed tests for the test backend with file init_method 2022-08-17T13:52:56.6025201Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/algorithms/quantization/test_quantization.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:52:56.602173] 2022-08-17T13:52:58.1634432Z 2022-08-17T13:52:58.3873107Z Running distributed tests for the nccl backend with env init_method 2022-08-17T13:52:58.3874670Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/algorithms/quantization/test_quantization.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 13:52:58.387183] 2022-08-17T13:52:59.9293583Z , <__main__.DistQuantizationTests testMethod=test_all_gather_fp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_bfp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_fp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_single_bfp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_single_fp16>]> 2022-08-17T13:52:59.9294627Z test_all_gather_bfp16 (__main__.DistQuantizationTests) 2022-08-17T13:52:59.9295003Z test_all_gather_fp16 (__main__.DistQuantizationTests) 2022-08-17T13:52:59.9295347Z test_all_to_all_bfp16 (__main__.DistQuantizationTests) 2022-08-17T13:52:59.9295704Z test_all_to_all_fp16 (__main__.DistQuantizationTests) 2022-08-17T13:52:59.9296080Z test_all_to_all_single_bfp16 (__main__.DistQuantizationTests) 2022-08-17T13:52:59.9296444Z test_all_to_all_single_fp16 (__main__.DistQuantizationTests) 2022-08-17T13:52:59.9296798Z 2022-08-17T13:53:01.3153198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:01.3154008Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:01.3156879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:01.3157375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:01.4884434Z Test results will be stored in test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization 2022-08-17T13:53:01.4901479Z 2022-08-17T13:53:01.4901736Z Running tests... 2022-08-17T13:53:01.4902184Z ---------------------------------------------------------------------- 2022-08-17T13:53:01.4910915Z test_all_gather_bfp16 (__main__.DistQuantizationTests) ... 
skip: Only gloo backend supports all_gather_fp16 (0.001s) 2022-08-17T13:53:01.4911456Z 2022-08-17T13:53:01.4911753Z ---------------------------------------------------------------------- 2022-08-17T13:53:01.4912067Z Ran 1 test in 0.001s 2022-08-17T13:53:01.4912228Z 2022-08-17T13:53:01.4912342Z OK (skipped=1) 2022-08-17T13:53:01.4912498Z 2022-08-17T13:53:01.4912623Z Generating XML reports... 2022-08-17T13:53:01.4945076Z Generated XML report: test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135301.xml 2022-08-17T13:53:03.1388484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:03.1388970Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:03.1392092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:03.1392601Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:03.3144793Z Test results will be stored in test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization 2022-08-17T13:53:03.3162981Z 2022-08-17T13:53:03.3163239Z Running tests... 2022-08-17T13:53:03.3163893Z ---------------------------------------------------------------------- 2022-08-17T13:53:03.3172110Z test_all_gather_fp16 (__main__.DistQuantizationTests) ... skip: Only gloo backend supports all_gather_fp16 (0.001s) 2022-08-17T13:53:03.3172697Z 2022-08-17T13:53:03.3173291Z ---------------------------------------------------------------------- 2022-08-17T13:53:03.3173790Z Ran 1 test in 0.001s 2022-08-17T13:53:03.3173939Z 2022-08-17T13:53:03.3174050Z OK (skipped=1) 2022-08-17T13:53:03.3174203Z 2022-08-17T13:53:03.3174329Z Generating XML reports... 
2022-08-17T13:53:03.3207531Z Generated XML report: test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135303.xml 2022-08-17T13:53:04.9089090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:04.9089601Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:04.9092277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:04.9092773Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:05.0757579Z Test results will be stored in test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization 2022-08-17T13:53:05.0773876Z 2022-08-17T13:53:05.0774339Z Running tests... 2022-08-17T13:53:05.0774851Z ---------------------------------------------------------------------- 2022-08-17T13:53:06.5609896Z test_all_to_all_bfp16 (__main__.DistQuantizationTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:53:06.5796986Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 96978 2022-08-17T13:53:06.5802769Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 96979 2022-08-17T13:53:08.0173328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:08.0174209Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:08.0175637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:08.0176432Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:08.0264475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:08.0265255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:08.0269437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:08.0270172Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:08.1816302Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:53:08.1819208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:53:08.1981562Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:53:08.1985049Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:53:08.1986421Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:53:08.1988933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:53:08.2024527Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:53:08.2027082Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:53:08.2028426Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:53:08.2092275Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:53:10.9911195Z ok (5.913s) 2022-08-17T13:53:10.9911525Z 2022-08-17T13:53:10.9912332Z ---------------------------------------------------------------------- 2022-08-17T13:53:10.9912825Z Ran 1 test in 5.914s 2022-08-17T13:53:10.9912995Z 2022-08-17T13:53:10.9913069Z OK 2022-08-17T13:53:10.9913204Z 2022-08-17T13:53:10.9913337Z Generating XML reports... 2022-08-17T13:53:10.9949027Z Generated XML report: test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135305.xml 2022-08-17T13:53:12.7485817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:12.7486354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:12.7488686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:12.7489180Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:12.9209653Z Test results will be stored in test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization 2022-08-17T13:53:12.9226907Z 2022-08-17T13:53:12.9227314Z Running tests... 
2022-08-17T13:53:12.9227808Z ---------------------------------------------------------------------- 2022-08-17T13:53:14.4375021Z test_all_to_all_fp16 (__main__.DistQuantizationTests) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:53:14.4563677Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97097 2022-08-17T13:53:14.4569697Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97098 2022-08-17T13:53:15.8700298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:15.8700802Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:15.8703799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:15.8704340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:15.9047940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:15.9048418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:15.9051697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:15.9052181Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:16.0358725Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:53:16.0361487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:53:16.0788743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:53:16.0792357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:53:16.0793470Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:53:16.0796463Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:53:16.0870913Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:53:16.0872995Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:53:16.0873688Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:53:16.0899519Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:53:18.9684275Z ok (6.045s) 2022-08-17T13:53:18.9684469Z 2022-08-17T13:53:18.9685418Z ---------------------------------------------------------------------- 2022-08-17T13:53:18.9685989Z Ran 1 test in 6.046s 2022-08-17T13:53:18.9686247Z 2022-08-17T13:53:18.9686403Z OK 2022-08-17T13:53:18.9686650Z 2022-08-17T13:53:18.9686909Z Generating XML reports... 
2022-08-17T13:53:18.9728360Z Generated XML report: test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135312.xml 2022-08-17T13:53:20.7494308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:20.7494838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:20.7497641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:20.7498118Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:20.9221883Z Test results will be stored in test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization 2022-08-17T13:53:20.9239107Z 2022-08-17T13:53:20.9239338Z Running tests... 2022-08-17T13:53:20.9239770Z ---------------------------------------------------------------------- 2022-08-17T13:53:22.4330484Z test_all_to_all_single_bfp16 (__main__.DistQuantizationTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:53:22.4518469Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97216 2022-08-17T13:53:22.4524776Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97217 2022-08-17T13:53:23.8763199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:23.8763736Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:23.8766231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:23.8766734Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:23.9346417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:23.9346892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:23.9350881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:23.9351371Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:24.0423088Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:53:24.0426353Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:53:24.1038649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:53:24.1042036Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:53:24.1043322Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:53:24.1045860Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:53:24.1140926Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:53:24.1143375Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:53:24.1144431Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:53:24.1149065Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:53:26.9640599Z ok (6.040s) 2022-08-17T13:53:26.9640805Z 2022-08-17T13:53:26.9641287Z ---------------------------------------------------------------------- 2022-08-17T13:53:26.9641613Z Ran 1 test in 6.040s 2022-08-17T13:53:26.9641779Z 2022-08-17T13:53:26.9641874Z OK 2022-08-17T13:53:26.9642008Z 2022-08-17T13:53:26.9642142Z Generating XML reports... 2022-08-17T13:53:26.9677314Z Generated XML report: test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135320.xml 2022-08-17T13:53:28.7176735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:28.7177282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:28.7179990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:28.7180478Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:28.8903432Z Test results will be stored in test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization 2022-08-17T13:53:28.8920487Z 2022-08-17T13:53:28.8920813Z Running tests... 
2022-08-17T13:53:28.8921244Z ---------------------------------------------------------------------- 2022-08-17T13:53:30.4139866Z test_all_to_all_single_fp16 (__main__.DistQuantizationTests) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:53:30.4329296Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97335 2022-08-17T13:53:30.4335301Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97336 2022-08-17T13:53:31.8363611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:31.8364144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:31.8366981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:31.8367499Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:31.8586343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:31.8586803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:31.8591168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:31.8591654Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:32.0092683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:53:32.0095309Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:53:32.0263102Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:53:32.0266814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:53:32.0267713Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:53:32.0270219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:53:32.0300104Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:53:32.0302696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:53:32.0303414Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:53:32.0373445Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:53:34.8448977Z ok (5.953s) 2022-08-17T13:53:34.8449250Z 2022-08-17T13:53:34.8449650Z ---------------------------------------------------------------------- 2022-08-17T13:53:34.8449993Z Ran 1 test in 5.953s 2022-08-17T13:53:34.8450160Z 2022-08-17T13:53:34.8450253Z OK 2022-08-17T13:53:34.8450371Z 2022-08-17T13:53:34.8450507Z Generating XML reports... 2022-08-17T13:53:34.8489583Z Generated XML report: test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135328.xml 2022-08-17T13:53:35.3961847Z Running distributed tests for the nccl backend with file init_method 2022-08-17T13:53:35.3963144Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/algorithms/quantization/test_quantization.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 13:53:35.395999] 2022-08-17T13:53:36.9880161Z , <__main__.DistQuantizationTests testMethod=test_all_gather_fp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_bfp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_fp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_single_bfp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_single_fp16>]> 2022-08-17T13:53:36.9881059Z test_all_gather_bfp16 (__main__.DistQuantizationTests) 2022-08-17T13:53:36.9881454Z test_all_gather_fp16 (__main__.DistQuantizationTests) 2022-08-17T13:53:36.9881800Z test_all_to_all_bfp16 (__main__.DistQuantizationTests) 2022-08-17T13:53:36.9882160Z test_all_to_all_fp16 (__main__.DistQuantizationTests) 2022-08-17T13:53:36.9882527Z test_all_to_all_single_bfp16 (__main__.DistQuantizationTests) 2022-08-17T13:53:36.9882888Z test_all_to_all_single_fp16 (__main__.DistQuantizationTests) 2022-08-17T13:53:36.9883243Z 2022-08-17T13:53:38.3842690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:38.3843487Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:38.3846115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:38.5572475Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:38.5573107Z Test results will be stored in test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization 2022-08-17T13:53:38.5590364Z 2022-08-17T13:53:38.5590588Z Running tests... 2022-08-17T13:53:38.5591167Z ---------------------------------------------------------------------- 2022-08-17T13:53:38.5599365Z test_all_gather_bfp16 (__main__.DistQuantizationTests) ... 
skip: Only gloo backend supports all_gather_fp16 (0.001s) 2022-08-17T13:53:38.5600137Z 2022-08-17T13:53:38.5600464Z ---------------------------------------------------------------------- 2022-08-17T13:53:38.5600801Z Ran 1 test in 0.001s 2022-08-17T13:53:38.5600960Z 2022-08-17T13:53:38.5601069Z OK (skipped=1) 2022-08-17T13:53:38.5601227Z 2022-08-17T13:53:38.5601352Z Generating XML reports... 2022-08-17T13:53:38.5634031Z Generated XML report: test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135338.xml 2022-08-17T13:53:40.1424364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:40.1424860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:40.1428212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:40.1428943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:40.3201903Z Test results will be stored in test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization 2022-08-17T13:53:40.3219295Z 2022-08-17T13:53:40.3219555Z Running tests... 2022-08-17T13:53:40.3220067Z ---------------------------------------------------------------------- 2022-08-17T13:53:40.3228526Z test_all_gather_fp16 (__main__.DistQuantizationTests) ... skip: Only gloo backend supports all_gather_fp16 (0.001s) 2022-08-17T13:53:40.3228945Z 2022-08-17T13:53:40.3229233Z ---------------------------------------------------------------------- 2022-08-17T13:53:40.3229569Z Ran 1 test in 0.001s 2022-08-17T13:53:40.3229732Z 2022-08-17T13:53:40.3229840Z OK (skipped=1) 2022-08-17T13:53:40.3229996Z 2022-08-17T13:53:40.3230125Z Generating XML reports... 
2022-08-17T13:53:40.3263158Z Generated XML report: test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135340.xml 2022-08-17T13:53:41.9509407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:41.9509921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:41.9513028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:42.1258033Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:42.1258669Z Test results will be stored in test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization 2022-08-17T13:53:42.1276203Z 2022-08-17T13:53:42.1276557Z Running tests... 2022-08-17T13:53:42.1276998Z ---------------------------------------------------------------------- 2022-08-17T13:53:43.6429888Z test_all_to_all_bfp16 (__main__.DistQuantizationTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:53:43.6619305Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97553 2022-08-17T13:53:43.6626222Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97554 2022-08-17T13:53:45.0667328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:45.0668130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:45.0671301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:45.0671785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:45.0976674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:45.0977133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:45.0981686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:45.0982164Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:45.2334955Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:53:45.2338155Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:53:45.2695523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:53:45.2699731Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:53:45.2700650Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:53:45.2703121Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:53:45.2746078Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:53:45.2748928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:53:45.2749814Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:53:45.2806725Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:53:48.0738916Z ok (5.946s) 2022-08-17T13:53:48.0739121Z 2022-08-17T13:53:48.0739495Z ---------------------------------------------------------------------- 2022-08-17T13:53:48.0739832Z Ran 1 test in 5.946s 2022-08-17T13:53:48.0739979Z 2022-08-17T13:53:48.0740078Z OK 2022-08-17T13:53:48.0740214Z 2022-08-17T13:53:48.0740351Z Generating XML reports... 2022-08-17T13:53:48.0776342Z Generated XML report: test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135342.xml 2022-08-17T13:53:49.8389591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:49.8390168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:49.8393265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:49.8393762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:50.0141415Z Test results will be stored in test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization 2022-08-17T13:53:50.0159188Z 2022-08-17T13:53:50.0159624Z Running tests... 
2022-08-17T13:53:50.0160116Z ---------------------------------------------------------------------- 2022-08-17T13:53:51.5489876Z test_all_to_all_fp16 (__main__.DistQuantizationTests) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:53:51.5672829Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97672 2022-08-17T13:53:51.5679029Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97673 2022-08-17T13:53:52.9499527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:52.9500015Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:52.9502727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:52.9503231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:52.9755000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:52.9755442Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:52.9760018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:52.9760486Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:53.1155445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:53:53.1157783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:53:53.1465512Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:53:53.1469574Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:53:53.1470661Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:53:53.1473475Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:53:53.1567087Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:53:53.1569729Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:53:53.1570407Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:53:53.1576724Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:53:56.0791921Z ok (6.063s) 2022-08-17T13:53:56.0792208Z 2022-08-17T13:53:56.0792734Z ---------------------------------------------------------------------- 2022-08-17T13:53:56.0793078Z Ran 1 test in 6.063s 2022-08-17T13:53:56.0793241Z 2022-08-17T13:53:56.0793334Z OK 2022-08-17T13:53:56.0793468Z 2022-08-17T13:53:56.0793586Z Generating XML reports... 
2022-08-17T13:53:56.0829283Z Generated XML report: test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135350.xml 2022-08-17T13:53:57.8440286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:53:57.8440793Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:53:57.8442937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:53:57.8443420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:53:58.0142104Z Test results will be stored in test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization 2022-08-17T13:53:58.0158930Z 2022-08-17T13:53:58.0159375Z Running tests... 2022-08-17T13:53:58.0159847Z ---------------------------------------------------------------------- 2022-08-17T13:53:59.5281221Z test_all_to_all_single_bfp16 (__main__.DistQuantizationTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:53:59.5466453Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97791 2022-08-17T13:53:59.5473353Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97792 2022-08-17T13:54:00.9234056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:00.9234575Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:00.9237232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:00.9237742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:00.9510181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:00.9510631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:00.9515040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:00.9515516Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:01.0881977Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:54:01.0884821Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:54:01.1216002Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:54:01.1219724Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:54:01.1220764Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:54:01.1223639Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:54:01.1292231Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:54:01.1294642Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:54:01.1295328Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:54:01.1326781Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:54:03.9586056Z ok (5.942s) 2022-08-17T13:54:03.9586386Z 2022-08-17T13:54:03.9586854Z ---------------------------------------------------------------------- 2022-08-17T13:54:03.9587200Z Ran 1 test in 5.943s 2022-08-17T13:54:03.9587362Z 2022-08-17T13:54:03.9587458Z OK 2022-08-17T13:54:03.9587592Z 2022-08-17T13:54:03.9587731Z Generating XML reports... 2022-08-17T13:54:03.9623238Z Generated XML report: test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135358.xml 2022-08-17T13:54:05.7187108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:05.7187609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:05.7190407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:05.7190872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:05.8904174Z Test results will be stored in test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:05.8921557Z 2022-08-17T13:54:05.8921987Z Running tests... 
2022-08-17T13:54:05.8922487Z ---------------------------------------------------------------------- 2022-08-17T13:54:07.4049756Z test_all_to_all_single_fp16 (__main__.DistQuantizationTests) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:54:07.4239130Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 97910 2022-08-17T13:54:07.4245425Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 97911 2022-08-17T13:54:08.8357876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:08.8358379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:08.8361483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:08.8361995Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:08.8405495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:08.8405939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:08.8410044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:08.8410521Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:09.0057265Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:54:09.0060194Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:54:09.0072872Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:54:09.0076483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:54:09.0077607Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:54:09.0080352Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:54:09.0165399Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:54:09.0167698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:54:09.0168362Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:54:09.0183608Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:54:11.8355769Z ok (5.943s) 2022-08-17T13:54:11.8356118Z 2022-08-17T13:54:11.8356780Z ---------------------------------------------------------------------- 2022-08-17T13:54:11.8357377Z Ran 1 test in 5.943s 2022-08-17T13:54:11.8357688Z 2022-08-17T13:54:11.8357852Z OK 2022-08-17T13:54:11.8358101Z 2022-08-17T13:54:11.8358337Z Generating XML reports... 2022-08-17T13:54:11.8394169Z Generated XML report: test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135405.xml 2022-08-17T13:54:12.4079716Z Running distributed tests for the gloo backend with env init_method 2022-08-17T13:54:12.4082158Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/algorithms/quantization/test_quantization.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 13:54:12.407835] 2022-08-17T13:54:13.9637520Z , <__main__.DistQuantizationTests testMethod=test_all_gather_fp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_bfp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_fp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_single_bfp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_single_fp16>]> 2022-08-17T13:54:13.9638442Z test_all_gather_bfp16 (__main__.DistQuantizationTests) 2022-08-17T13:54:13.9638805Z test_all_gather_fp16 (__main__.DistQuantizationTests) 2022-08-17T13:54:13.9639180Z test_all_to_all_bfp16 (__main__.DistQuantizationTests) 2022-08-17T13:54:13.9639542Z test_all_to_all_fp16 (__main__.DistQuantizationTests) 2022-08-17T13:54:13.9639919Z test_all_to_all_single_bfp16 (__main__.DistQuantizationTests) 2022-08-17T13:54:13.9640284Z test_all_to_all_single_fp16 (__main__.DistQuantizationTests) 2022-08-17T13:54:13.9640644Z 2022-08-17T13:54:15.3442393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:15.3443801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:15.3445654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:15.3446591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:15.5198126Z Test results will be stored in test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:15.5216093Z 2022-08-17T13:54:15.5216477Z Running tests... 2022-08-17T13:54:15.5216971Z ---------------------------------------------------------------------- 2022-08-17T13:54:17.0395013Z test_all_gather_bfp16 (__main__.DistQuantizationTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:54:17.0585559Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98062 2022-08-17T13:54:17.0591999Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98063 2022-08-17T13:54:18.4556675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:18.4557620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:18.4558808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:18.4559755Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:18.4741976Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:18.4743124Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:18.4748364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:18.4749360Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:18.6208206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:54:18.6384266Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:54:18.6597920Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:54:18.6598899Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:54:18.6600223Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:54:18.6601446Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:54:18.9653103Z ok (3.443s) 2022-08-17T13:54:18.9653321Z 2022-08-17T13:54:18.9653721Z ---------------------------------------------------------------------- 2022-08-17T13:54:18.9654064Z Ran 1 test in 3.444s 2022-08-17T13:54:18.9654233Z 2022-08-17T13:54:18.9654334Z OK 2022-08-17T13:54:18.9654452Z 2022-08-17T13:54:18.9654598Z Generating XML reports... 2022-08-17T13:54:18.9690238Z Generated XML report: test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135415.xml 2022-08-17T13:54:20.7088498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:20.7089024Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:20.7091803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:20.7092298Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:20.8824991Z Test results will be stored in test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:20.8842249Z 2022-08-17T13:54:20.8842681Z Running tests... 2022-08-17T13:54:20.8843220Z ---------------------------------------------------------------------- 2022-08-17T13:54:22.3967361Z test_all_gather_fp16 (__main__.DistQuantizationTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:54:22.4148231Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98171 2022-08-17T13:54:22.4154845Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98172 2022-08-17T13:54:23.8381045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:23.8381857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:23.8383769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:23.8384386Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:23.8401671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:23.8402405Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:23.8406451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:23.8407178Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:24.0061510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:54:24.0069510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:54:24.0272262Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:54:24.0276092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:54:24.0276935Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:54:24.0374347Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:54:24.3207588Z ok (3.436s) 2022-08-17T13:54:24.3207883Z 2022-08-17T13:54:24.3208435Z ---------------------------------------------------------------------- 2022-08-17T13:54:24.3208763Z Ran 1 test in 3.436s 2022-08-17T13:54:24.3208928Z 2022-08-17T13:54:24.3209020Z OK 2022-08-17T13:54:24.3209154Z 2022-08-17T13:54:24.3209286Z Generating XML reports... 2022-08-17T13:54:24.3243739Z Generated XML report: test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135420.xml 2022-08-17T13:54:26.0576087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:26.0576611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:26.0579486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:26.0579963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:26.2308585Z Test results will be stored in test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:26.2324992Z 2022-08-17T13:54:26.2325432Z Running tests... 2022-08-17T13:54:26.2325911Z ---------------------------------------------------------------------- 2022-08-17T13:54:26.2336272Z test_all_to_all_bfp16 (__main__.DistQuantizationTests) ... skip: Only nccl backend supports all_to_all_fp16 (0.001s) 2022-08-17T13:54:26.2336583Z 2022-08-17T13:54:26.2336877Z ---------------------------------------------------------------------- 2022-08-17T13:54:26.2337208Z Ran 1 test in 0.001s 2022-08-17T13:54:26.2337375Z 2022-08-17T13:54:26.2337490Z OK (skipped=1) 2022-08-17T13:54:26.2337645Z 2022-08-17T13:54:26.2337769Z Generating XML reports... 
2022-08-17T13:54:26.2370486Z Generated XML report: test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135426.xml 2022-08-17T13:54:27.8409851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:27.8410353Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:27.8412639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:27.8413111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:28.0122755Z Test results will be stored in test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:28.0139438Z 2022-08-17T13:54:28.0139654Z Running tests... 2022-08-17T13:54:28.0140094Z ---------------------------------------------------------------------- 2022-08-17T13:54:28.0149786Z test_all_to_all_fp16 (__main__.DistQuantizationTests) ... skip: Only nccl backend supports all_to_all_fp16 (0.001s) 2022-08-17T13:54:28.0150090Z 2022-08-17T13:54:28.0150384Z ---------------------------------------------------------------------- 2022-08-17T13:54:28.0150696Z Ran 1 test in 0.001s 2022-08-17T13:54:28.0150864Z 2022-08-17T13:54:28.0150977Z OK (skipped=1) 2022-08-17T13:54:28.0151131Z 2022-08-17T13:54:28.0151258Z Generating XML reports... 
2022-08-17T13:54:28.0182894Z Generated XML report: test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135428.xml 2022-08-17T13:54:29.5967327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:29.5967836Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:29.5969982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:29.5970476Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:29.7639387Z Test results will be stored in test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:29.7655302Z 2022-08-17T13:54:29.7655732Z Running tests... 2022-08-17T13:54:29.7656217Z ---------------------------------------------------------------------- 2022-08-17T13:54:29.7665485Z test_all_to_all_single_bfp16 (__main__.DistQuantizationTests) ... skip: Only nccl backend supports all_to_all_single_bfp16 (0.001s) 2022-08-17T13:54:29.7665815Z 2022-08-17T13:54:29.7666099Z ---------------------------------------------------------------------- 2022-08-17T13:54:29.7666415Z Ran 1 test in 0.001s 2022-08-17T13:54:29.7666580Z 2022-08-17T13:54:29.7666692Z OK (skipped=1) 2022-08-17T13:54:29.7666847Z 2022-08-17T13:54:29.7666974Z Generating XML reports... 
2022-08-17T13:54:29.7698044Z Generated XML report: test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135429.xml 2022-08-17T13:54:31.3942329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:31.3942848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:31.3944692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:31.3945177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:31.5690511Z Test results will be stored in test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:31.5707712Z 2022-08-17T13:54:31.5708115Z Running tests... 2022-08-17T13:54:31.5708579Z ---------------------------------------------------------------------- 2022-08-17T13:54:31.5717785Z test_all_to_all_single_fp16 (__main__.DistQuantizationTests) ... skip: Only nccl backend supports all_to_all_single_fp16 (0.001s) 2022-08-17T13:54:31.5718137Z 2022-08-17T13:54:31.5718416Z ---------------------------------------------------------------------- 2022-08-17T13:54:31.5718746Z Ran 1 test in 0.001s 2022-08-17T13:54:31.5719147Z 2022-08-17T13:54:31.5719257Z OK (skipped=1) 2022-08-17T13:54:31.5719417Z 2022-08-17T13:54:31.5719543Z Generating XML reports... 2022-08-17T13:54:31.5752234Z Generated XML report: test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135431.xml 2022-08-17T13:54:32.0170382Z Running distributed tests for the gloo backend with file init_method 2022-08-17T13:54:32.0172035Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/algorithms/quantization/test_quantization.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 13:54:32.016906] 2022-08-17T13:54:33.6049777Z , <__main__.DistQuantizationTests testMethod=test_all_gather_fp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_bfp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_fp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_single_bfp16>, <__main__.DistQuantizationTests testMethod=test_all_to_all_single_fp16>]> 2022-08-17T13:54:33.6051084Z test_all_gather_bfp16 (__main__.DistQuantizationTests) 2022-08-17T13:54:33.6051474Z test_all_gather_fp16 (__main__.DistQuantizationTests) 2022-08-17T13:54:33.6051840Z test_all_to_all_bfp16 (__main__.DistQuantizationTests) 2022-08-17T13:54:33.6052217Z test_all_to_all_fp16 (__main__.DistQuantizationTests) 2022-08-17T13:54:33.6052606Z test_all_to_all_single_bfp16 (__main__.DistQuantizationTests) 2022-08-17T13:54:33.6052991Z test_all_to_all_single_fp16 (__main__.DistQuantizationTests) 2022-08-17T13:54:33.6053356Z 2022-08-17T13:54:34.9561167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:34.9561689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:34.9564181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:34.9564672Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:35.1287200Z Test results will be stored in test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:35.1304494Z 2022-08-17T13:54:35.1304764Z Running tests... 2022-08-17T13:54:35.1305438Z ---------------------------------------------------------------------- 2022-08-17T13:54:36.6690816Z test_all_gather_bfp16 (__main__.DistQuantizationTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:54:36.6880075Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98445 2022-08-17T13:54:36.6886334Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98446 2022-08-17T13:54:38.1438075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:38.1438688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:38.1441281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:38.1441771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:38.1474212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:38.1474675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:38.1479084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:38.1479563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:38.3115434Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:54:38.3148319Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:54:38.3359982Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:54:38.3360556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:54:38.3361332Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:54:38.3362068Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:54:38.6942680Z ok (3.563s) 2022-08-17T13:54:38.6943088Z 2022-08-17T13:54:38.6943990Z ---------------------------------------------------------------------- 2022-08-17T13:54:38.6944612Z Ran 1 test in 3.564s 2022-08-17T13:54:38.6945312Z 2022-08-17T13:54:38.6945485Z OK 2022-08-17T13:54:38.6945711Z 2022-08-17T13:54:38.6945946Z Generating XML reports... 2022-08-17T13:54:38.6980963Z Generated XML report: test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135435.xml 2022-08-17T13:54:40.4097270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:40.4097777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:40.4100127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:40.4100619Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:40.5754609Z Test results will be stored in test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:40.5769918Z 2022-08-17T13:54:40.5770132Z Running tests... 2022-08-17T13:54:40.5770591Z ---------------------------------------------------------------------- 2022-08-17T13:54:42.0769849Z test_all_gather_fp16 (__main__.DistQuantizationTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:54:42.0953313Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98554 2022-08-17T13:54:42.0959669Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98555 2022-08-17T13:54:43.5233885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:43.5234380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:43.5236585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:43.5237076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:43.5426409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:43.5426903Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:43.5431466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:43.5431945Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:43.6875582Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:54:43.7123519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:54:43.7288601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:54:43.7289119Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:54:43.7289872Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:54:43.7290597Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:54:44.1017976Z ok (3.524s) 2022-08-17T13:54:44.1018213Z 2022-08-17T13:54:44.1019122Z ---------------------------------------------------------------------- 2022-08-17T13:54:44.1019496Z Ran 1 test in 3.525s 2022-08-17T13:54:44.1019663Z 2022-08-17T13:54:44.1019738Z OK 2022-08-17T13:54:44.1019873Z 2022-08-17T13:54:44.1020007Z Generating XML reports... 2022-08-17T13:54:44.1054706Z Generated XML report: test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135440.xml 2022-08-17T13:54:45.8188747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:45.8189280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:45.8191708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:45.8192566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:45.9854239Z Test results will be stored in test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:45.9870447Z 2022-08-17T13:54:45.9870744Z Running tests... 2022-08-17T13:54:45.9871350Z ---------------------------------------------------------------------- 2022-08-17T13:54:45.9881727Z test_all_to_all_bfp16 (__main__.DistQuantizationTests) ... skip: Only nccl backend supports all_to_all_fp16 (0.001s) 2022-08-17T13:54:45.9882308Z 2022-08-17T13:54:45.9882838Z ---------------------------------------------------------------------- 2022-08-17T13:54:45.9883546Z Ran 1 test in 0.001s 2022-08-17T13:54:45.9883796Z 2022-08-17T13:54:45.9883912Z OK (skipped=1) 2022-08-17T13:54:45.9884071Z 2022-08-17T13:54:45.9884199Z Generating XML reports... 
2022-08-17T13:54:45.9914411Z Generated XML report: test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135445.xml 2022-08-17T13:54:47.6192225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:47.6192733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:47.6195648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:47.6196123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:47.7919624Z Test results will be stored in test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:47.7936984Z 2022-08-17T13:54:47.7937174Z Running tests... 2022-08-17T13:54:47.7938040Z ---------------------------------------------------------------------- 2022-08-17T13:54:47.7948849Z test_all_to_all_fp16 (__main__.DistQuantizationTests) ... skip: Only nccl backend supports all_to_all_fp16 (0.001s) 2022-08-17T13:54:47.7949446Z 2022-08-17T13:54:47.7950038Z ---------------------------------------------------------------------- 2022-08-17T13:54:47.7950636Z Ran 1 test in 0.001s 2022-08-17T13:54:47.7950812Z 2022-08-17T13:54:47.7950924Z OK (skipped=1) 2022-08-17T13:54:47.7951062Z 2022-08-17T13:54:47.7951192Z Generating XML reports... 
2022-08-17T13:54:47.7984283Z Generated XML report: test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135447.xml 2022-08-17T13:54:49.4083138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:49.4083700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:49.4085097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:49.4085926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:49.5771847Z Test results will be stored in test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:49.5787805Z 2022-08-17T13:54:49.5788120Z Running tests... 2022-08-17T13:54:49.5789341Z ---------------------------------------------------------------------- 2022-08-17T13:54:49.5797921Z test_all_to_all_single_bfp16 (__main__.DistQuantizationTests) ... skip: Only nccl backend supports all_to_all_single_bfp16 (0.001s) 2022-08-17T13:54:49.5798558Z 2022-08-17T13:54:49.5799135Z ---------------------------------------------------------------------- 2022-08-17T13:54:49.5799622Z Ran 1 test in 0.001s 2022-08-17T13:54:49.5799790Z 2022-08-17T13:54:49.5799902Z OK (skipped=1) 2022-08-17T13:54:49.5800060Z 2022-08-17T13:54:49.5800169Z Generating XML reports... 
2022-08-17T13:54:49.5830806Z Generated XML report: test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135449.xml 2022-08-17T13:54:51.1957077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:51.1957584Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:51.1960657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:51.1961152Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:51.3712829Z Test results will be stored in test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization 2022-08-17T13:54:51.3729791Z 2022-08-17T13:54:51.3729932Z Running tests... 2022-08-17T13:54:51.3730350Z ---------------------------------------------------------------------- 2022-08-17T13:54:51.3740548Z test_all_to_all_single_fp16 (__main__.DistQuantizationTests) ... skip: Only nccl backend supports all_to_all_single_fp16 (0.001s) 2022-08-17T13:54:51.3741076Z 2022-08-17T13:54:51.3741531Z ---------------------------------------------------------------------- 2022-08-17T13:54:51.3741878Z Ran 1 test in 0.001s 2022-08-17T13:54:51.3742052Z 2022-08-17T13:54:51.3742166Z OK (skipped=1) 2022-08-17T13:54:51.3742325Z 2022-08-17T13:54:51.3742461Z Generating XML reports... 2022-08-17T13:54:51.3776000Z Generated XML report: test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135451.xml 2022-08-17T13:54:51.8111427Z Running distributed/test_pg_wrapper ... [2022-08-17 13:54:51.810646] 2022-08-17T13:54:51.8112896Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_pg_wrapper.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 13:54:51.810717] 2022-08-17T13:54:53.3826623Z 2022-08-17T13:54:53.3827310Z 2022-08-17T13:54:53.3828847Z , <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch_cuda>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch_cuda_debug_mode>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch_debug_mode>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch_cuda>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch_cuda_debug_mode>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch_debug_mode>]> 2022-08-17T13:54:53.3830350Z test_collective_hang (__main__.ProcessGroupGlooWrapperTest) 2022-08-17T13:54:53.3830788Z test_collective_shape_mismatch (__main__.ProcessGroupGlooWrapperTest) 2022-08-17T13:54:53.3831247Z test_collective_shape_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) 2022-08-17T13:54:53.3831711Z test_collective_shape_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-08-17T13:54:53.3832194Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-08-17T13:54:53.3832914Z test_collectives_op_mismatch (__main__.ProcessGroupGlooWrapperTest) 2022-08-17T13:54:53.3833365Z test_collectives_op_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) 2022-08-17T13:54:53.3833840Z test_collectives_op_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-08-17T13:54:53.3834314Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-08-17T13:54:53.3835336Z , <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collective_shape_mismatch>, <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collective_shape_mismatch_debug_mode>, 
<__main__.ProcessGroupNCCLWrapperTest testMethod=test_collectives_op_mismatch>, <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collectives_op_mismatch_debug_mode>]> 2022-08-17T13:54:53.3836412Z test_collective_hang (__main__.ProcessGroupNCCLWrapperTest) 2022-08-17T13:54:53.3836827Z test_collective_shape_mismatch (__main__.ProcessGroupNCCLWrapperTest) 2022-08-17T13:54:53.3837286Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) 2022-08-17T13:54:53.3837737Z test_collectives_op_mismatch (__main__.ProcessGroupNCCLWrapperTest) 2022-08-17T13:54:53.3838174Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) 2022-08-17T13:54:54.7970819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:54.7971322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:54.7973926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:54.7974410Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:54.9729478Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:54:54.9745058Z 2022-08-17T13:54:54.9745239Z Running tests... 2022-08-17T13:54:54.9745880Z ---------------------------------------------------------------------- 2022-08-17T13:54:56.5088963Z test_collective_hang (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:54:56.5275589Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 98828 2022-08-17T13:54:56.5281672Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 98829 2022-08-17T13:54:56.5288312Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 98830 2022-08-17T13:54:56.5295228Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 98831 2022-08-17T13:54:57.9196138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:57.9196708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:57.9197310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:57.9197767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:57.9198527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:57.9199026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:57.9199623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:57.9200096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:57.9432315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:57.9432804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:57.9436975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:57.9437744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:57.9721695Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:54:57.9722180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:54:57.9725728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:54:57.9726204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:54:58.0930638Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:54:58.0941261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:54:58.1148092Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:54:58.1374108Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:54:58.1649925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:54:58.1752718Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:54:58.1753219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-08-17T13:54:58.1754763Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-08-17T13:54:58.1755526Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:54:58.1756217Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:54:58.1756920Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-08-17T13:54:58.1757615Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:54:58.2397900Z [E ProcessGroupGloo.cpp:2798] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 2000 ms 2022-08-17T13:54:58.2410891Z [E ProcessGroupGloo.cpp:136] [Rank 0]: Ranks 1 failed to pass monitoredBarrier in 2000 ms 2022-08-17T13:54:58.2516444Z [E ProcessGroupGloo.cpp:136] Rank 2 successfully reached monitoredBarrier, but received errors while waiting for send/recv from rank 0. Please check rank 0 logs for faulty rank. 2022-08-17T13:54:58.2616570Z [E ProcessGroupGloo.cpp:136] Rank 3 successfully reached monitoredBarrier, but received errors while waiting for send/recv from rank 0. Please check rank 0 logs for faulty rank. 2022-08-17T13:54:58.7361093Z ok (3.761s) 2022-08-17T13:54:58.7361310Z 2022-08-17T13:54:58.7361705Z ---------------------------------------------------------------------- 2022-08-17T13:54:58.7362046Z Ran 1 test in 3.761s 2022-08-17T13:54:58.7362229Z 2022-08-17T13:54:58.7362308Z OK 2022-08-17T13:54:58.7362449Z 2022-08-17T13:54:58.7362590Z Generating XML reports... 
2022-08-17T13:54:58.7408111Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135454.xml 2022-08-17T13:55:00.5192005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:00.5192513Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:00.5195134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:00.5195623Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:00.6967853Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:55:00.6983836Z 2022-08-17T13:55:00.6984542Z Running tests... 2022-08-17T13:55:00.6985253Z ---------------------------------------------------------------------- 2022-08-17T13:55:02.2509962Z test_collective_shape_mismatch (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:55:02.2694538Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99035 2022-08-17T13:55:02.2701398Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 99036 2022-08-17T13:55:02.2709747Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 99037 2022-08-17T13:55:02.2716324Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 99038 2022-08-17T13:55:03.6631596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:03.6632652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:03.6634687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:03.6635175Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:03.6785018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:03.6785457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:03.6789804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:03.6790273Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:03.6997732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:03.6998171Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:03.7002325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:03.7002799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:03.7072438Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:03.7072875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:03.7076755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:03.7077226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:03.8300662Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:55:03.8463342Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:55:03.8725280Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:55:03.8729388Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:55:03.8917302Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:55:03.9083628Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:55:03.9187149Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-08-17T13:55:03.9187667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-08-17T13:55:03.9188416Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:55:03.9189121Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:55:03.9189823Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-08-17T13:55:03.9226155Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:55:04.4782863Z ok (3.780s) 2022-08-17T13:55:04.4783055Z 2022-08-17T13:55:04.4783435Z ---------------------------------------------------------------------- 2022-08-17T13:55:04.4784058Z Ran 1 test in 3.780s 2022-08-17T13:55:04.4784233Z 2022-08-17T13:55:04.4784329Z OK 2022-08-17T13:55:04.4784463Z 2022-08-17T13:55:04.4784599Z Generating XML reports... 2022-08-17T13:55:04.4832812Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135500.xml 2022-08-17T13:55:06.2187490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:06.2188509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:06.2189542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:06.2190036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:06.3859633Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:55:06.3874156Z 2022-08-17T13:55:06.3874303Z Running tests... 2022-08-17T13:55:06.3875506Z ---------------------------------------------------------------------- 2022-08-17T13:55:07.8635619Z test_collective_shape_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:55:07.8815623Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99242 2022-08-17T13:55:07.8821092Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 99243 2022-08-17T13:55:07.8827435Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 99244 2022-08-17T13:55:07.8833947Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 99245 2022-08-17T13:55:09.2649190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:09.2649689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:09.2651850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:09.2652331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:09.2738313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:09.2738752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:09.2742323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:09.2742815Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:09.3472881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:09.3473364Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:09.3475526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:09.3476000Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:09.3598865Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:09.3599299Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:09.3603580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:09.3604065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:09.4322072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:55:09.4400200Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:55:09.5195976Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:55:09.5346080Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:55:09.8894766Z skip: Need at least 4 CUDA devices (3.502s) 2022-08-17T13:55:09.8895024Z 2022-08-17T13:55:09.8895425Z ---------------------------------------------------------------------- 2022-08-17T13:55:09.8895754Z Ran 1 test in 3.502s 2022-08-17T13:55:09.8895918Z 2022-08-17T13:55:09.8896035Z OK (skipped=1) 2022-08-17T13:55:09.8896195Z 2022-08-17T13:55:09.8896324Z Generating XML reports... 
2022-08-17T13:55:09.8943128Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135506.xml 2022-08-17T13:55:11.6676583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:11.6677129Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:11.6679833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:11.6680317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:11.8418421Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:55:11.8434646Z 2022-08-17T13:55:11.8434788Z Running tests... 2022-08-17T13:55:11.8435221Z ---------------------------------------------------------------------- 2022-08-17T13:55:13.3454782Z test_collective_shape_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:55:13.3642529Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99413 2022-08-17T13:55:13.3649204Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 99414 2022-08-17T13:55:13.3655935Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 99415 2022-08-17T13:55:13.3662668Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 99416 2022-08-17T13:55:14.7683595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:14.7684096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:14.7685411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:14.7685897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:14.7746492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:14.7746949Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:14.7750871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:14.7751351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:14.7842093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:14.7842809Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:14.7846262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:14.7846738Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:14.7897656Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:14.7898276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:14.7902900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:14.7903437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:14.9359738Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:55:14.9427806Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:55:14.9507097Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:55:14.9622941Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:55:15.3724440Z skip: Need at least 4 CUDA devices (3.529s) 2022-08-17T13:55:15.3724715Z 2022-08-17T13:55:15.3725098Z ---------------------------------------------------------------------- 2022-08-17T13:55:15.3725736Z Ran 1 test in 3.529s 2022-08-17T13:55:15.3725902Z 2022-08-17T13:55:15.3726010Z OK (skipped=1) 2022-08-17T13:55:15.3726149Z 2022-08-17T13:55:15.3726286Z Generating XML reports... 
2022-08-17T13:55:15.3773075Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135511.xml 2022-08-17T13:55:17.1553115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:17.1553616Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:17.1556265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:17.1556745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:17.3316843Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:55:17.3332462Z 2022-08-17T13:55:17.3332607Z Running tests... 2022-08-17T13:55:17.3333260Z ---------------------------------------------------------------------- 2022-08-17T13:55:18.8583906Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:55:18.8774817Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99584 2022-08-17T13:55:18.8781194Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 99585 2022-08-17T13:55:18.8788221Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 99586 2022-08-17T13:55:18.8795288Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 99587 2022-08-17T13:55:20.2837630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:20.2838609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:20.2840095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:20.2841000Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:20.2842213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:20.2843087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:20.2845269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:20.2846204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:20.3301401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:20.3302308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:20.3305956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:20.3306890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:20.4144375Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:20.4145380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:20.4147260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:20.4148180Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:20.4587888Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:55:20.4590133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:55:20.4960755Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:55:20.5867953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:55:20.6593524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:55:20.6696074Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:55:20.6696676Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-08-17T13:55:20.6697164Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-08-17T13:55:20.6697920Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:55:20.6698613Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:55:20.6699295Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-08-17T13:55:20.6699986Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:55:20.7217049Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:55:20.7319141Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:55:20.7419890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-08-17T13:55:20.7420524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-08-17T13:55:20.7421300Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-08-17T13:55:20.7421995Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-08-17T13:55:20.7422687Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-08-17T13:55:20.7423688Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-08-17T13:55:21.2868482Z ok (3.953s) 2022-08-17T13:55:21.2868687Z 2022-08-17T13:55:21.2869093Z ---------------------------------------------------------------------- 2022-08-17T13:55:21.2869430Z Ran 1 test in 3.954s 2022-08-17T13:55:21.2869601Z 2022-08-17T13:55:21.2869696Z OK 2022-08-17T13:55:21.2869835Z 2022-08-17T13:55:21.2869970Z Generating XML reports... 
2022-08-17T13:55:21.2915700Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135517.xml 2022-08-17T13:55:23.0478652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:23.0479176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:23.0482002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:23.0482744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:23.2220923Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:55:23.2236391Z 2022-08-17T13:55:23.2236608Z Running tests... 2022-08-17T13:55:23.2237061Z ---------------------------------------------------------------------- 2022-08-17T13:55:24.7490831Z test_collectives_op_mismatch (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:55:24.7678185Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 99803 2022-08-17T13:55:24.7684738Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 99804 2022-08-17T13:55:24.7691341Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 99805 2022-08-17T13:55:24.7698427Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 99806 2022-08-17T13:55:26.1656930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:26.1657625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:26.1659018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:26.1659486Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:26.1670896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:26.1671351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:26.1675268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:26.1675744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:26.1793389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:26.1793847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:26.1798285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:26.1798743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:26.2108960Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:26.2109419Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:26.2112995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:26.2113458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:26.3377946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:55:26.3408248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:55:26.3518906Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:55:26.3807302Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:55:26.3993039Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:55:26.4122742Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:55:26.4227504Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-08-17T13:55:26.4228358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-08-17T13:55:26.4229363Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:55:26.4230273Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:55:26.4301621Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-08-17T13:55:26.4331629Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:55:26.9763463Z ok (3.752s) 2022-08-17T13:55:26.9763674Z 2022-08-17T13:55:26.9764081Z ---------------------------------------------------------------------- 2022-08-17T13:55:26.9764405Z Ran 1 test in 3.753s 2022-08-17T13:55:26.9764572Z 2022-08-17T13:55:26.9764673Z OK 2022-08-17T13:55:26.9764808Z 2022-08-17T13:55:26.9764942Z Generating XML reports... 2022-08-17T13:55:26.9812322Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135523.xml 2022-08-17T13:55:28.7245585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:28.7246178Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:28.7248336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:28.7248821Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:28.8926849Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:55:28.8941011Z 2022-08-17T13:55:28.8941439Z Running tests... 2022-08-17T13:55:28.8941930Z ---------------------------------------------------------------------- 2022-08-17T13:55:30.3812721Z test_collectives_op_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:55:30.3993667Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100010 2022-08-17T13:55:30.3999925Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100011 2022-08-17T13:55:30.4006084Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 100012 2022-08-17T13:55:30.4012302Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 100013 2022-08-17T13:55:31.7930694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:31.7931204Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:31.7933426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:31.7933907Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:31.8134960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:31.8135412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:31.8139267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:31.8139747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:31.8634706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:31.8635165Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:31.8639280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:31.8639756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:31.9010677Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:31.9011151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:31.9014881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:31.9015423Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:31.9596239Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:55:31.9791765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:55:32.0373321Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:55:32.0709021Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:55:32.4074092Z skip: Need at least 4 CUDA devices (3.513s) 2022-08-17T13:55:32.4074494Z 2022-08-17T13:55:32.4075221Z ---------------------------------------------------------------------- 2022-08-17T13:55:32.4075546Z Ran 1 test in 3.513s 2022-08-17T13:55:32.4075707Z 2022-08-17T13:55:32.4075818Z OK (skipped=1) 2022-08-17T13:55:32.4075975Z 2022-08-17T13:55:32.4076109Z Generating XML reports... 
2022-08-17T13:55:32.4121273Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135528.xml 2022-08-17T13:55:34.1439796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:34.1440310Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:34.1442714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:34.1443195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:34.3184077Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:55:34.3199346Z 2022-08-17T13:55:34.3199754Z Running tests... 2022-08-17T13:55:34.3200239Z ---------------------------------------------------------------------- 2022-08-17T13:55:35.8460534Z test_collectives_op_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:55:35.8653165Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100181 2022-08-17T13:55:35.8659019Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100182 2022-08-17T13:55:35.8664934Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 100183 2022-08-17T13:55:35.8671486Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 100184 2022-08-17T13:55:37.2616326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:37.2616848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:37.2618492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:37.2618972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:37.2625608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:37.2626076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:37.2630709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:37.2631174Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:37.2634368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:37.2634829Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:37.2638726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:37.2639195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:37.2805833Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:37.2806323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:37.2809936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:37.2810411Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:37.4363718Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:55:37.4422506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:55:37.4423075Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:55:37.4533085Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:55:37.7731464Z skip: Need at least 4 CUDA devices (3.453s) 2022-08-17T13:55:37.7732000Z 2022-08-17T13:55:37.7732633Z ---------------------------------------------------------------------- 2022-08-17T13:55:37.7732963Z Ran 1 test in 3.453s 2022-08-17T13:55:37.7733129Z 2022-08-17T13:55:37.7733237Z OK (skipped=1) 2022-08-17T13:55:37.7733396Z 2022-08-17T13:55:37.7733525Z Generating XML reports... 
2022-08-17T13:55:37.7780286Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135534.xml 2022-08-17T13:55:39.5615681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:39.5616177Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:39.5619039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:39.5619548Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:39.7357669Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:55:39.7372903Z 2022-08-17T13:55:39.7373047Z Running tests... 2022-08-17T13:55:39.7373860Z ---------------------------------------------------------------------- 2022-08-17T13:55:41.2590999Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:55:41.2776848Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100352 2022-08-17T13:55:41.2783583Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100353 2022-08-17T13:55:41.2790464Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 100354 2022-08-17T13:55:41.2797006Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 100355 2022-08-17T13:55:42.6722023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:42.6722546Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:42.6724479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:42.6724970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:42.6726633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:42.6727095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:42.6731088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:42.6731564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:42.6947035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:42.6947490Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:42.6951502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:42.6952008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:42.6967365Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:42.6967814Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:42.6972660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:42.6973148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:42.8442419Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T13:55:42.8447510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:55:42.8611691Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T13:55:42.8710704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:55:42.9375670Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:55:42.9477474Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:55:42.9579422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-08-17T13:55:42.9579922Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-08-17T13:55:42.9580687Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:55:42.9581398Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:55:42.9582102Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-08-17T13:55:42.9582776Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-08-17T13:55:43.0304240Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:55:43.0406192Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:55:43.0509202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-08-17T13:55:43.0509921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-08-17T13:55:43.0510626Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-08-17T13:55:43.0511790Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-08-17T13:55:43.0610983Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-08-17T13:55:43.0611671Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-08-17T13:55:43.5866111Z ok (3.849s) 2022-08-17T13:55:43.5866320Z 2022-08-17T13:55:43.5866723Z ---------------------------------------------------------------------- 2022-08-17T13:55:43.5867045Z Ran 1 test in 3.849s 2022-08-17T13:55:43.5867211Z 2022-08-17T13:55:43.5867307Z OK 2022-08-17T13:55:43.5867445Z 2022-08-17T13:55:43.5867580Z Generating XML reports... 
2022-08-17T13:55:43.5914932Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135539.xml 2022-08-17T13:55:45.3572175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:45.3572714Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:45.3574934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:45.3575405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:45.5315283Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:55:45.5330291Z 2022-08-17T13:55:45.5330672Z Running tests... 2022-08-17T13:55:45.5331194Z ---------------------------------------------------------------------- 2022-08-17T13:55:47.0307676Z test_collective_hang (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:55:47.0486924Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100571 2022-08-17T13:55:47.0493176Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100572 2022-08-17T13:55:48.4828877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:48.4829819Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:48.4831053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:48.4831542Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:48.5196073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:48.5197024Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:48.5200674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:48.5201621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:48.6501304Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:55:48.6504351Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:55:48.6901495Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:55:48.6904795Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:55:48.6906304Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:55:48.6913174Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:55:48.7222914Z [E ProcessGroupGloo.cpp:2798] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 2000 ms 2022-08-17T13:55:48.7223525Z [E ProcessGroupGloo.cpp:136] [Rank 0]: Ranks 1 failed to pass monitoredBarrier in 2000 ms 2022-08-17T13:55:49.0546830Z ok (3.521s) 2022-08-17T13:55:49.0547083Z 2022-08-17T13:55:49.0547489Z ---------------------------------------------------------------------- 2022-08-17T13:55:49.0547836Z Ran 1 test in 3.522s 2022-08-17T13:55:49.0548002Z 2022-08-17T13:55:49.0548098Z OK 2022-08-17T13:55:49.0548218Z 2022-08-17T13:55:49.0548359Z Generating XML reports... 2022-08-17T13:55:49.0594149Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20220817135545.xml 2022-08-17T13:55:50.8154458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:50.8154961Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:50.8157504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:50.8158235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:50.9909762Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:55:50.9924497Z 2022-08-17T13:55:50.9924883Z Running tests... 2022-08-17T13:55:50.9925409Z ---------------------------------------------------------------------- 2022-08-17T13:55:52.5016149Z test_collective_shape_mismatch (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:55:52.5198049Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100684 2022-08-17T13:55:52.5205014Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100685 2022-08-17T13:55:53.8996815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:53.8998224Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:53.8999808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:53.9000726Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:53.9597072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:53.9598029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:53.9600941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:53.9601911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:54.0666242Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:55:54.0670085Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:55:54.1264879Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:55:54.1269162Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:55:54.1269915Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:55:54.1282537Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:55:56.0296856Z ok (5.037s) 2022-08-17T13:55:56.0297074Z 2022-08-17T13:55:56.0297462Z ---------------------------------------------------------------------- 2022-08-17T13:55:56.0297801Z Ran 1 test in 5.037s 2022-08-17T13:55:56.0297972Z 2022-08-17T13:55:56.0298049Z OK 2022-08-17T13:55:56.0298185Z 2022-08-17T13:55:56.0298324Z Generating XML reports... 2022-08-17T13:55:56.0343333Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20220817135550.xml 2022-08-17T13:55:57.7901081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:55:57.7901650Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:55:57.7904429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:55:57.7904916Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:55:57.9638039Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:55:57.9653224Z 2022-08-17T13:55:57.9653487Z Running tests... 2022-08-17T13:55:57.9653916Z ---------------------------------------------------------------------- 2022-08-17T13:55:59.4897749Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:55:59.5087808Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100813 2022-08-17T13:55:59.5094097Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100814 2022-08-17T13:56:00.9055011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:00.9055570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:00.9057807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:00.9058293Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:00.9173131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:00.9173588Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:00.9178029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:00.9178659Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:01.0756480Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:01.0886894Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:01.1070170Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:01.1070946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:56:01.1071906Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:56:01.1072603Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:01.1179085Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:56:01.1180004Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:56:01.1181063Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:56:01.1181784Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:56:03.0183724Z ok (5.053s) 2022-08-17T13:56:03.0183949Z 2022-08-17T13:56:03.0184352Z ---------------------------------------------------------------------- 2022-08-17T13:56:03.0184690Z Ran 1 test in 5.053s 2022-08-17T13:56:03.0184853Z 2022-08-17T13:56:03.0184949Z OK 2022-08-17T13:56:03.0185083Z 2022-08-17T13:56:03.0185204Z Generating XML reports... 2022-08-17T13:56:03.0231762Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20220817135557.xml 2022-08-17T13:56:04.7759451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:04.7759962Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:04.7762462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:04.7762948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:04.9499119Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:56:04.9514920Z 2022-08-17T13:56:04.9515199Z Running tests... 
2022-08-17T13:56:04.9515636Z ---------------------------------------------------------------------- 2022-08-17T13:56:06.4563254Z test_collectives_op_mismatch (__main__.ProcessGroupNCCLWrapperTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:56:06.4749752Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 100948 2022-08-17T13:56:06.4756456Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 100949 2022-08-17T13:56:07.8553168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:07.8553803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:07.8556564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:07.8557444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:07.8880484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:07.8880954Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:07.8884761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:07.8885419Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:08.0209997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:08.0213336Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:08.0613078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:08.0617325Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:56:08.0618270Z INFO:torch.distributed.distributed_c10d:Rank 0: 
Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:08.0621379Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:10.9871457Z ok (6.035s) 2022-08-17T13:56:10.9871662Z 2022-08-17T13:56:10.9872072Z ---------------------------------------------------------------------- 2022-08-17T13:56:10.9872396Z Ran 1 test in 6.036s 2022-08-17T13:56:10.9872559Z 2022-08-17T13:56:10.9872652Z OK 2022-08-17T13:56:10.9872784Z 2022-08-17T13:56:10.9872930Z Generating XML reports... 2022-08-17T13:56:10.9917581Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20220817135604.xml 2022-08-17T13:56:12.7516507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:12.7517009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:12.7519254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:12.7519752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:12.9248210Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-08-17T13:56:12.9262498Z 2022-08-17T13:56:12.9262696Z Running tests... 2022-08-17T13:56:12.9263206Z ---------------------------------------------------------------------- 2022-08-17T13:56:14.4486081Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:56:14.4674035Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101078 2022-08-17T13:56:14.4680403Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101079 2022-08-17T13:56:15.8089075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:15.8090003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:15.8092844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:15.8093801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:15.8679988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:15.8681293Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:15.8683769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:15.8684733Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:15.9767555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:16.0391734Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:16.0604746Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:16.0605767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:56:16.0607348Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:56:16.0608854Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:16.0816296Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-08-17T13:56:16.0817334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-08-17T13:56:16.0818741Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:56:16.0820072Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-08-17T13:56:19.0804294Z ok (6.154s) 2022-08-17T13:56:19.0804499Z 2022-08-17T13:56:19.0804892Z ---------------------------------------------------------------------- 2022-08-17T13:56:19.0805554Z Ran 1 test in 6.154s 2022-08-17T13:56:19.0805755Z 2022-08-17T13:56:19.0805855Z OK 2022-08-17T13:56:19.0805973Z 2022-08-17T13:56:19.0806121Z Generating XML reports... 2022-08-17T13:56:19.0851072Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20220817135612.xml 2022-08-17T13:56:19.6476405Z Running distributed/fsdp/test_fsdp_misc ... [2022-08-17 13:56:19.647170] 2022-08-17T13:56:19.6477173Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_misc.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:56:19.647245] 2022-08-17T13:56:21.2498823Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_misc 2022-08-17T13:56:21.2516925Z 2022-08-17T13:56:21.2517177Z Running tests... 
2022-08-17T13:56:21.2517617Z ---------------------------------------------------------------------- 2022-08-17T13:56:21.2525946Z test_cpu_init_with_sync_module_states (__main__.TestFSDPMisc) 2022-08-17T13:56:22.7813718Z Tests that passing ``sync_module_states=True`` raises an error for ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:56:22.7998269Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101214 2022-08-17T13:56:22.8004668Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101215 2022-08-17T13:56:24.2121284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:24.2121780Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:24.2124109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:24.2124599Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:24.2440896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:24.2441370Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:24.2445838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:24.2446526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:24.3790590Z dist init r=1, world=2 2022-08-17T13:56:24.3794798Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:24.4163718Z dist init r=0, world=2 2022-08-17T13:56:24.4168650Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:56:24.4169380Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for 
key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:24.4203152Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:25.8061283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:25.8061786Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:25.8286217Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:56:25.8286994Z warnings.warn( 2022-08-17T13:56:25.8288115Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:56:25.8288872Z warnings.warn( 2022-08-17T13:56:26.2091863Z ok (4.957s) 2022-08-17T13:56:26.2097561Z test_device_id_auto_wrap (__main__.TestFSDPMisc) 2022-08-17T13:56:26.2111558Z Tests that ``auto_wrap_policy`` propagates ``device_id`` to all ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101293 2022-08-17T13:56:26.2117428Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101294 2022-08-17T13:56:27.6345339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:27.6345847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:27.6348194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:27.6348697Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:27.6568408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:27.6568887Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:27.6573173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:27.6573649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:27.8004735Z dist init r=1, world=2 2022-08-17T13:56:27.8008466Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:27.8298828Z dist init r=0, world=2 2022-08-17T13:56:27.8303468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:56:27.8304354Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:27.8315048Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:56:29.2248105Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:29.2248707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:29.7206301Z ok (3.511s) 2022-08-17T13:56:29.7214270Z test_fsdp_cpu_init_stays_on_cpu (__main__.TestFSDPMisc) 2022-08-17T13:56:29.7227641Z Tests that passing a CPU module to FSDP preserves that the wrapped ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101372 2022-08-17T13:56:29.7233618Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101373 2022-08-17T13:56:31.0884787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:31.0886068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:31.0887645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:31.0888574Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:31.1601713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:31.1602672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:31.1604444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:31.1605427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:31.2568849Z dist init r=1, world=2 2022-08-17T13:56:31.2573278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:31.3312390Z dist init r=0, world=2 2022-08-17T13:56:31.3316935Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-08-17T13:56:31.3317967Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:31.3387493Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:32.7100398Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:32.7100908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:32.7375775Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:56:32.7376789Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:56:32.7378112Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:56:32.7379095Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:56:33.6332975Z ok (3.913s) 2022-08-17T13:56:33.6341557Z test_fsdp_device_id_cpu_offload (__main__.TestFSDPMisc) 2022-08-17T13:56:33.6354864Z Ensures that even if device_id is specified but we have ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101455 2022-08-17T13:56:33.6360792Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101456 2022-08-17T13:56:35.0712542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:35.0713174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:35.0714735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:35.0715221Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:35.0926992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:35.0927681Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:35.0931694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:35.0932157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:35.2383885Z dist init r=0, world=2 2022-08-17T13:56:35.2388253Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:56:35.2634658Z dist init r=1, world=2 2022-08-17T13:56:35.2639480Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:35.2640621Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:35.2695023Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:56:36.6512282Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:36.6512852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:37.1449136Z ok (3.512s) 2022-08-17T13:56:37.1464020Z test_fsdp_device_id_use_index_False (__main__.TestFSDPMisc) 2022-08-17T13:56:37.1477634Z Tests the FSDP ``device_id`` argument: ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101534 2022-08-17T13:56:37.1483586Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101535 2022-08-17T13:56:38.5573658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:38.5574176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:38.5576172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:38.5576657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:38.5881356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:38.5881817Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:38.5886189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:38.5886667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:38.7242387Z dist init r=1, world=2 2022-08-17T13:56:38.7245467Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:38.7607734Z dist init r=0, world=2 2022-08-17T13:56:38.7612712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:56:38.7613492Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:38.7653917Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:40.1407402Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:40.1407956Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:40.5569334Z ok (3.412s) 2022-08-17T13:56:40.5583446Z test_fsdp_device_id_use_index_True (__main__.TestFSDPMisc) 2022-08-17T13:56:40.5596897Z Tests the FSDP ``device_id`` argument: ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101613 2022-08-17T13:56:40.5602925Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101614 2022-08-17T13:56:41.9916027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:41.9916539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:41.9918859Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:41.9919340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:42.0166756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:42.0167207Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:42.0171641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:42.0172294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:42.1561299Z dist init r=1, world=2 2022-08-17T13:56:42.1565276Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:42.1883692Z dist init r=0, world=2 2022-08-17T13:56:42.1888629Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:56:42.1889352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:42.1974257Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:43.5621852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:43.5622375Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:43.9689382Z ok (3.412s) 2022-08-17T13:56:43.9715596Z test_fsdp_module_no_compute_grad_use_second_layer_False_sharding_strategy_None (__main__.TestFSDPMisc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101692 2022-08-17T13:56:43.9721550Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101693 2022-08-17T13:56:45.3893567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:45.3894190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:45.3896999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:45.3897488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:45.4080260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:45.4080719Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:45.4085286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:45.4085773Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:45.5548085Z dist init r=1, world=2 2022-08-17T13:56:45.5551858Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:45.5782667Z dist init r=0, world=2 2022-08-17T13:56:45.5787377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:56:45.5788135Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:45.5858582Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:56:46.9580576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:46.9581605Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:46.9836579Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:56:46.9837204Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:56:46.9837920Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:56:46.9838439Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:56:47.8819832Z ok (3.913s) 2022-08-17T13:56:47.8847486Z test_fsdp_module_no_compute_grad_use_second_layer_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestFSDPMisc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101775 2022-08-17T13:56:47.8853262Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101776 2022-08-17T13:56:49.3494502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:49.3495004Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:49.3497626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:49.3498094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:49.3741034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:49.3741495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:49.3745549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:49.3746015Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:49.5229851Z dist init r=0, world=2 2022-08-17T13:56:49.5233724Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:56:49.5425412Z dist init r=1, world=2 2022-08-17T13:56:49.5430159Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:49.5430945Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:49.5438628Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:56:50.9079739Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:50.9080255Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:50.9330287Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:56:50.9330851Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:56:50.9331559Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:56:50.9332103Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:56:51.7951102Z ok (3.913s) 2022-08-17T13:56:51.7977942Z test_fsdp_module_no_compute_grad_use_second_layer_True_sharding_strategy_None (__main__.TestFSDPMisc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101858 2022-08-17T13:56:51.7984378Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101859 2022-08-17T13:56:53.2933752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:53.2934779Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:53.2936251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:53.2936788Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:53.3030818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:53.3031295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:53.3036215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:53.3036697Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:53.4602538Z dist init r=0, world=2 2022-08-17T13:56:53.4607017Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:56:53.4741931Z dist init r=1, world=2 2022-08-17T13:56:53.4747474Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:53.4748840Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:53.4812842Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:56:54.8565386Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:54.8565915Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:54.8793425Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:56:54.8794008Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:56:54.8829768Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:56:54.8830333Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:56:55.7077271Z ok (3.913s) 2022-08-17T13:56:55.7102171Z test_fsdp_module_no_compute_grad_use_second_layer_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestFSDPMisc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 101941 2022-08-17T13:56:55.7108451Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 101942 2022-08-17T13:56:57.1124010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:57.1124980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:57.1126506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:57.1127459Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:57.1692177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:56:57.1693144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:56:57.1698211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:56:57.1699212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:56:57.2774491Z dist init r=0, world=2 2022-08-17T13:56:57.2778586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:56:57.3413678Z dist init r=1, world=2 2022-08-17T13:56:57.3419067Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:56:57.3420422Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:56:57.3492427Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:56:58.6982484Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:56:58.6983860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:56:58.7230224Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:56:58.7230790Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:56:58.7231496Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:56:58.7232369Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:56:59.6215081Z ok (3.914s) 2022-08-17T13:56:59.6236854Z test_fsdp_namedtuple (__main__.TestFSDPMisc) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102024 2022-08-17T13:56:59.6242615Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102025 2022-08-17T13:57:01.0188531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:01.0189526Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:01.0191121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:01.0192057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:01.0909944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:01.0910897Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:01.0915669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: 
loaded 238 disabled tests 2022-08-17T13:57:01.0916632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:01.1857247Z dist init r=0, world=2 2022-08-17T13:57:01.1861144Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:01.2637599Z dist init r=1, world=2 2022-08-17T13:57:01.2642488Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:01.2643907Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:01.2677215Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:02.6505171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:02.6505752Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:02.6739984Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:57:02.6740977Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:57:02.6742196Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:57:02.6743198Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:57:03.1341235Z ok (3.513s) 2022-08-17T13:57:03.1352397Z test_fsdp_same_model_across_ranks (__main__.TestFSDPMisc) 2022-08-17T13:57:03.1365179Z FSDP broadcasts model from rank 0 to ensure it starts off with the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102103 2022-08-17T13:57:03.1371582Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102104 2022-08-17T13:57:04.5818851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:04.5819553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:04.5821339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:04.5821818Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:04.5837601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:04.5838335Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:04.5842325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:04.5842819Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:04.7504910Z dist init r=1, world=2 2022-08-17T13:57:04.7508690Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:04.7536472Z dist init r=0, world=2 2022-08-17T13:57:04.7540999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:04.7542073Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:04.7612155Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:57:06.1196827Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:06.1197373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:06.5454232Z ok (3.411s) 2022-08-17T13:57:06.5459114Z test_module_device_mismatches_device_id (__main__.TestFSDPMisc) 2022-08-17T13:57:06.5472773Z Tests that specifying a ``device_id`` argument to FSDP for a GPU ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102182 2022-08-17T13:57:06.5478723Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102183 2022-08-17T13:57:07.9898135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:07.9898640Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:07.9901434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:07.9901949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:08.0441770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:08.0442242Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:08.0447215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:08.0447755Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:08.1569289Z dist init r=1, world=2 2022-08-17T13:57:08.1573395Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:08.2171746Z dist init r=0, world=2 2022-08-17T13:57:08.2176275Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-08-17T13:57:08.2177495Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:08.2185208Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:09.5812943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:09.5813488Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:10.0577074Z ok (3.512s) 2022-08-17T13:57:10.0582909Z test_multi_device_not_supported (__main__.TestFSDPMisc) 2022-08-17T13:57:10.0597058Z Tests that wrapping a multi-device module (i.e. with submodules on ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102261 2022-08-17T13:57:10.0602905Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102262 2022-08-17T13:57:11.4786580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:11.4787838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:11.4790159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:11.4791091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:11.5139257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:11.5140189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:11.5143998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:11.5145186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:11.6472020Z dist init r=1, world=2 
2022-08-17T13:57:11.6476581Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:11.6872725Z dist init r=0, world=2 2022-08-17T13:57:11.6877492Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:11.6878734Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:11.6885305Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:13.0760022Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:13.0760993Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:13.4687834Z ok (3.411s) 2022-08-17T13:57:13.4694700Z test_no_params (__main__.TestFSDPMisc) 2022-08-17T13:57:13.4708822Z Test that device_id and cpu init work if module has no params ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102340 2022-08-17T13:57:13.4715072Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102341 2022-08-17T13:57:14.8440394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:14.8440924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:14.8443274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:14.8443758Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:14.9081866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:14.9082340Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:14.9086548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:14.9087041Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:15.0129354Z dist init r=1, world=2 2022-08-17T13:57:15.0133311Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:15.0829220Z dist init r=0, world=2 2022-08-17T13:57:15.0829700Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:15.0830442Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:15.0846554Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:57:16.4662653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:16.4663887Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:16.8799043Z ok (3.411s) 2022-08-17T13:57:16.8799222Z 2022-08-17T13:57:16.8799775Z ---------------------------------------------------------------------- 2022-08-17T13:57:16.8800102Z Ran 15 tests in 55.628s 2022-08-17T13:57:16.8800280Z 2022-08-17T13:57:16.8800375Z OK 2022-08-17T13:57:16.8800511Z 2022-08-17T13:57:16.8800649Z Generating XML reports... 2022-08-17T13:57:16.8853494Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_misc/TEST-TestFSDPMisc-20220817135621.xml 2022-08-17T13:57:17.2309044Z Running distributed/fsdp/test_fsdp_comm_hooks ... [2022-08-17 13:57:17.230411] 2022-08-17T13:57:17.2309949Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm_hooks.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:57:17.230483] 2022-08-17T13:57:18.8775576Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_comm_hooks 2022-08-17T13:57:18.8793490Z 2022-08-17T13:57:18.8793864Z Running tests... 2022-08-17T13:57:18.8794413Z ---------------------------------------------------------------------- 2022-08-17T13:57:20.3901048Z test_bf16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:57:20.4085428Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102454 2022-08-17T13:57:20.4091854Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102455 2022-08-17T13:57:21.8581616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:21.8582118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:21.8585368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:21.8585838Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:21.8586442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:21.8586916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:21.8590190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:21.8590658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:22.0314520Z dist init r=1, world=2 2022-08-17T13:57:22.0318365Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:22.0383684Z dist init r=0, world=2 2022-08-17T13:57:22.0388690Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:22.0389659Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:22.0421328Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:57:23.4286427Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:23.4287302Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:23.4521661Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:57:23.4522241Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:57:23.4522926Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:57:23.4523471Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:57:24.3193335Z ok (5.440s) 2022-08-17T13:57:24.3214348Z test_bf16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102537 2022-08-17T13:57:24.3220275Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102538 2022-08-17T13:57:25.7775237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:25.7775755Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:25.7778064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:25.7778556Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:25.8156757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:25.8157230Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:25.8161394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:25.8161889Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:25.9814312Z dist init r=1, world=2 2022-08-17T13:57:25.9817420Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:25.9881285Z dist init r=0, world=2 2022-08-17T13:57:25.9886055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:25.9887198Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:25.9920526Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:57:27.3593081Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:27.3593611Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:27.3912735Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:57:27.3913300Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:57:27.3914017Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:57:27.3914575Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:57:28.2321299Z ok (3.913s) 2022-08-17T13:57:28.2335274Z test_default_communication_hook_behavior_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-08-17T13:57:28.2349701Z Tests FSDP's default communication hook's behavior and correctness. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102620 2022-08-17T13:57:28.2356243Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102621 2022-08-17T13:57:29.6782773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:29.6783600Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:29.6785604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:29.6786094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:29.7386747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:29.7387208Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:29.7391412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:29.7392100Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:29.8446670Z dist init r=0, world=2 2022-08-17T13:57:29.8450718Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:29.9109731Z dist init r=1, world=2 2022-08-17T13:57:29.9114255Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:29.9115456Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:29.9164245Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:57:31.2758543Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:31.2759076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:31.2979345Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:57:31.2979954Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:57:31.2980669Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:57:31.2981217Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:57:32.1452644Z ok (3.913s) 2022-08-17T13:57:32.1466332Z test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-08-17T13:57:32.1480651Z Tests FSDP's communication hook interface behavior. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102703 2022-08-17T13:57:32.1486806Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102704 2022-08-17T13:57:33.6278156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:33.6278667Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:33.6280992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:33.6281479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:33.6744309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:33.6744757Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:33.6749265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:33.6749740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:33.7933828Z dist init r=1, world=2 2022-08-17T13:57:33.7937662Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:33.8464044Z dist init r=0, world=2 2022-08-17T13:57:33.8469769Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:33.8470833Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:33.8549492Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:57:35.2269634Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:35.2270156Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:35.6574409Z ok (3.512s) 2022-08-17T13:57:35.6587387Z test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-08-17T13:57:35.6600242Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102782 2022-08-17T13:57:35.6606093Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102783 2022-08-17T13:57:37.1053315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:37.1053811Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:37.1056157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:37.1056637Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:37.1567880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:37.1568364Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:37.1572361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:37.1572856Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:37.2713939Z dist init r=1, world=2 2022-08-17T13:57:37.2718024Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:37.3306279Z dist init r=0, world=2 2022-08-17T13:57:37.3311060Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:37.3311854Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:37.3328545Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:38.7119831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:38.7120369Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:39.1693530Z ok (3.512s) 2022-08-17T13:57:39.1706599Z test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-08-17T13:57:39.1719860Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102861 2022-08-17T13:57:39.1725724Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102862 2022-08-17T13:57:40.5856821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:40.5857318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:40.5860217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:40.5860734Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:40.6115213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:40.6115692Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:40.6119753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 
238 disabled tests 2022-08-17T13:57:40.6120237Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:40.7515133Z dist init r=0, world=2 2022-08-17T13:57:40.7519247Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:40.7825263Z dist init r=1, world=2 2022-08-17T13:57:40.7830749Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:40.7831762Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:40.7927677Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:42.1580079Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:42.1580613Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:42.5811312Z ok (3.412s) 2022-08-17T13:57:42.5824145Z test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-08-17T13:57:42.5838223Z Tests FSDP's communication hook interface behavior. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 102940 2022-08-17T13:57:42.5843996Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 102941 2022-08-17T13:57:43.9663917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:43.9664613Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:43.9667271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:43.9667760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:43.9837432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:43.9837882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:43.9842166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:43.9842649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:44.1325551Z dist init r=1, world=2 2022-08-17T13:57:44.1328933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:44.1566844Z dist init r=0, world=2 2022-08-17T13:57:44.1570974Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:44.1572166Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:44.1636354Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:57:45.5439882Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:45.5440404Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:45.9928136Z ok (3.412s) 2022-08-17T13:57:45.9941015Z test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-08-17T13:57:45.9954945Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103019 2022-08-17T13:57:45.9961073Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103020 2022-08-17T13:57:47.4408015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:47.4408544Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:47.4411125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:47.4411721Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:47.4490951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:47.4491444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:47.4495393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:47.4496058Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:47.6079154Z dist init r=0, world=2 2022-08-17T13:57:47.6082798Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:47.6182283Z dist init r=1, world=2 2022-08-17T13:57:47.6187375Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:47.6188306Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:47.6288549Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:48.9875685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:48.9876227Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:49.4045348Z ok (3.412s) 2022-08-17T13:57:49.4058317Z test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-08-17T13:57:49.4072569Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103098 2022-08-17T13:57:49.4078256Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103099 2022-08-17T13:57:50.8164285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:50.8164799Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:50.8166469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:50.8167091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:50.8511063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:50.8511562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:50.8515681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 
238 disabled tests 2022-08-17T13:57:50.8516178Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:50.9825266Z dist init r=1, world=2 2022-08-17T13:57:50.9829113Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:51.0250867Z dist init r=0, world=2 2022-08-17T13:57:51.0255772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:51.0256816Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:51.0339640Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:52.4088416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:52.4088946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:52.8162919Z ok (3.412s) 2022-08-17T13:57:52.8181445Z test_fp16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103177 2022-08-17T13:57:52.8187461Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103178 2022-08-17T13:57:54.2390011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:54.2390784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:54.2392144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:54.2392616Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:54.2606575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:54.2607049Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:54.2611454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:54.2611933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:54.4046344Z dist init r=1, world=2 2022-08-17T13:57:54.4050006Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:54.4322404Z dist init r=0, world=2 2022-08-17T13:57:54.4327030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:54.4327928Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:54.4356616Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:57:55.8223173Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:55.8224006Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:55.8483389Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:57:55.8485383Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:57:55.8486477Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:57:55.8487059Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:57:56.7285911Z ok (3.912s) 2022-08-17T13:57:56.7304985Z test_fp16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103260 2022-08-17T13:57:56.7311148Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103261 2022-08-17T13:57:58.2202538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:58.2203055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:58.2205552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:58.2206064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:58.2255547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:57:58.2256037Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:57:58.2260302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:57:58.2260781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:57:58.3869465Z dist init r=0, world=2 2022-08-17T13:57:58.3873075Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:57:58.3967697Z dist init r=1, world=2 2022-08-17T13:57:58.3972361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:57:58.3973438Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:57:58.3976779Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:57:59.7815113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:57:59.7815616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:57:59.8078120Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:57:59.8078698Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:57:59.8107806Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:57:59.8108388Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:58:00.6409981Z ok (3.912s) 2022-08-17T13:58:00.6415557Z test_registering_hook_non_root_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-08-17T13:58:00.6430336Z Tests FSDP's communication hook registering for submodules. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103343 2022-08-17T13:58:00.6435975Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103344 2022-08-17T13:58:02.0884480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:02.0885014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:02.0887278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:02.0887774Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:02.1088607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:02.1089114Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:02.1093179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:02.1093672Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:02.2551108Z dist init r=1, world=2 2022-08-17T13:58:02.2554751Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:58:02.2816914Z dist init r=0, world=2 2022-08-17T13:58:02.2821308Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:58:02.2822446Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:02.2861292Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:58:03.6885655Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:58:03.6886239Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:58:04.1523914Z ok (3.511s) 2022-08-17T13:58:04.1533617Z test_registering_hook_submodules_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-08-17T13:58:04.1549196Z Tests FSDP's communication hook registering for submodules. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103422 2022-08-17T13:58:04.1555249Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103423 2022-08-17T13:58:05.5424059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:05.5425336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:05.5428409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:05.5429346Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:05.5543215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:05.5544427Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:05.5548101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:05.5549074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:05.7161937Z dist init r=1, world=2 2022-08-17T13:58:05.7166025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:58:05.7259450Z dist init r=0, world=2 2022-08-17T13:58:05.7264237Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:58:05.7265092Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:05.7269109Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:07.1204336Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:58:07.1204865Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:58:07.5641201Z ok (3.412s) 2022-08-17T13:58:07.5641575Z 2022-08-17T13:58:07.5642215Z ---------------------------------------------------------------------- 2022-08-17T13:58:07.5642800Z Ran 13 tests in 48.685s 2022-08-17T13:58:07.5643132Z 2022-08-17T13:58:07.5644685Z OK 2022-08-17T13:58:07.5645005Z 2022-08-17T13:58:07.5645258Z Generating XML reports... 2022-08-17T13:58:07.5696609Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_comm_hooks/TEST-TestCommunicationHooks-20220817135718.xml 2022-08-17T13:58:07.9316888Z Running distributed/test_c10d_spawn_nccl ... [2022-08-17 13:58:07.931190] 2022-08-17T13:58:07.9317673Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_spawn_nccl.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 13:58:07.931265] 2022-08-17T13:58:09.4716279Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpov639f2h 2022-08-17T13:58:09.4717274Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpov639f2h/_remote_module_non_scriptable.py 2022-08-17T13:58:10.9891703Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:10.9940515Z 2022-08-17T13:58:10.9941088Z 2022-08-17T13:58:10.9942895Z , <__main__.TestDistributedNNFunctionsNccl testMethod=test_all_gather_base>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_all_to_all>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_all_to_all_single>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_allreduce>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_broadcast>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_reduce>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_reduce_scatter>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_reduce_scatter_non_contiguous>]> 2022-08-17T13:58:10.9945203Z test_all_gather (__main__.TestDistributedNNFunctionsNccl) 2022-08-17T13:58:10.9946131Z test_all_gather_base (__main__.TestDistributedNNFunctionsNccl) 2022-08-17T13:58:10.9946975Z test_all_to_all (__main__.TestDistributedNNFunctionsNccl) 2022-08-17T13:58:10.9947394Z test_all_to_all_single (__main__.TestDistributedNNFunctionsNccl) 2022-08-17T13:58:10.9947961Z test_allreduce (__main__.TestDistributedNNFunctionsNccl) 2022-08-17T13:58:10.9948360Z test_broadcast (__main__.TestDistributedNNFunctionsNccl) 2022-08-17T13:58:10.9948755Z test_reduce (__main__.TestDistributedNNFunctionsNccl) 2022-08-17T13:58:10.9949136Z test_reduce_scatter (__main__.TestDistributedNNFunctionsNccl) 2022-08-17T13:58:10.9949574Z test_reduce_scatter_non_contiguous (__main__.TestDistributedNNFunctionsNccl) 2022-08-17T13:58:12.3726854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 
2022-08-17T13:58:12.3727791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:12.3730769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:12.3731718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:12.5457847Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkofdnijn 2022-08-17T13:58:12.5459918Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkofdnijn/_remote_module_non_scriptable.py 2022-08-17T13:58:14.0579519Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:14.0651005Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-08-17T13:58:14.0668233Z 2022-08-17T13:58:14.0668719Z Running tests... 2022-08-17T13:58:14.0669227Z ---------------------------------------------------------------------- 2022-08-17T13:58:14.0829309Z test_all_gather (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103570 2022-08-17T13:58:14.0835823Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103571 2022-08-17T13:58:15.5317573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:15.5318550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:15.5320200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:15.5321188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:15.5496869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:15.5497801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:15.5503091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:15.5504328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:15.7015605Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz0ztehqy 2022-08-17T13:58:15.7017077Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz0ztehqy/_remote_module_non_scriptable.py 2022-08-17T13:58:15.7226281Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmd_pa_gi 2022-08-17T13:58:15.7230491Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmd_pa_gi/_remote_module_non_scriptable.py 2022-08-17T13:58:17.2808561Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:17.2855818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:58:17.2859241Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-08-17T13:58:17.2891206Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:17.2939529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:58:17.2943751Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:58:17.2945346Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:17.2962771Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:18.9959432Z ok (4.929s) 2022-08-17T13:58:18.9959638Z 2022-08-17T13:58:18.9960049Z ---------------------------------------------------------------------- 2022-08-17T13:58:18.9960375Z Ran 1 test in 4.929s 2022-08-17T13:58:18.9960555Z 2022-08-17T13:58:18.9960653Z OK 2022-08-17T13:58:18.9960795Z 2022-08-17T13:58:18.9960930Z Generating XML reports... 2022-08-17T13:58:19.0006008Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135814.xml 2022-08-17T13:58:20.7536308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:20.7536820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:20.7538843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:20.7539353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:20.9260087Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyuu0gyyz 2022-08-17T13:58:20.9262359Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyuu0gyyz/_remote_module_non_scriptable.py 2022-08-17T13:58:22.4375685Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:22.4444444Z Test results 
will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-08-17T13:58:22.4460397Z 2022-08-17T13:58:22.4460845Z Running tests... 2022-08-17T13:58:22.4461343Z ---------------------------------------------------------------------- 2022-08-17T13:58:22.4629153Z test_all_gather_base (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103689 2022-08-17T13:58:22.4635172Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103690 2022-08-17T13:58:23.8346319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:23.8346819Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:23.8349322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:23.8349816Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:23.8463928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:23.8464377Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:23.8468628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:23.8469109Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:23.9985835Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx43pq2e5 2022-08-17T13:58:23.9988759Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx43pq2e5/_remote_module_non_scriptable.py 2022-08-17T13:58:24.0111835Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpal3baace 2022-08-17T13:58:24.0114214Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpal3baace/_remote_module_non_scriptable.py 2022-08-17T13:58:25.5589594Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:25.5629098Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:25.5636953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:58:25.5640510Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:58:25.5674020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:58:25.5677559Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:58:25.5678574Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:25.5743940Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:27.1753326Z ok (4.729s) 2022-08-17T13:58:27.1753673Z 2022-08-17T13:58:27.1754069Z ---------------------------------------------------------------------- 2022-08-17T13:58:27.1754398Z Ran 1 test in 4.729s 2022-08-17T13:58:27.1754564Z 2022-08-17T13:58:27.1754659Z OK 2022-08-17T13:58:27.1754795Z 2022-08-17T13:58:27.1754935Z Generating XML reports... 
2022-08-17T13:58:27.1800575Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135822.xml 2022-08-17T13:58:28.9320587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:28.9321074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:28.9324085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:28.9324551Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:29.1028831Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqip5r3pf 2022-08-17T13:58:29.1031069Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqip5r3pf/_remote_module_non_scriptable.py 2022-08-17T13:58:30.6039149Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:30.6105780Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-08-17T13:58:30.6121199Z 2022-08-17T13:58:30.6121448Z Running tests... 2022-08-17T13:58:30.6121861Z ---------------------------------------------------------------------- 2022-08-17T13:58:30.6274445Z test_all_to_all (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103808 2022-08-17T13:58:30.6280580Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103809 2022-08-17T13:58:32.0151484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:32.0151967Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:32.0154620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:32.0155105Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:32.0408847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:32.0409300Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:32.0413314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:32.0414078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:32.1799993Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdkeicms3 2022-08-17T13:58:32.1802236Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdkeicms3/_remote_module_non_scriptable.py 2022-08-17T13:58:32.2134233Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1sjk1px0 2022-08-17T13:58:32.2136755Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1sjk1px0/_remote_module_non_scriptable.py 2022-08-17T13:58:33.7342985Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:33.7390480Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:58:33.7394022Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-08-17T13:58:33.7678200Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:33.7725559Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:58:33.7729574Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:58:33.7730366Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:33.7802451Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:35.3401271Z ok (4.728s) 2022-08-17T13:58:35.3401505Z 2022-08-17T13:58:35.3401911Z ---------------------------------------------------------------------- 2022-08-17T13:58:35.3402251Z Ran 1 test in 4.728s 2022-08-17T13:58:35.3402419Z 2022-08-17T13:58:35.3402514Z OK 2022-08-17T13:58:35.3402631Z 2022-08-17T13:58:35.3402779Z Generating XML reports... 2022-08-17T13:58:35.3446653Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135830.xml 2022-08-17T13:58:37.0906245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:37.0906747Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:37.0909447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:37.0909934Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:37.2624342Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv24r00ps 2022-08-17T13:58:37.2627115Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv24r00ps/_remote_module_non_scriptable.py 2022-08-17T13:58:38.7772564Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:38.7840965Z Test results 
will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-08-17T13:58:38.7856465Z 2022-08-17T13:58:38.7856741Z Running tests... 2022-08-17T13:58:38.7857169Z ---------------------------------------------------------------------- 2022-08-17T13:58:38.8016706Z test_all_to_all_single (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 103929 2022-08-17T13:58:38.8022878Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 103930 2022-08-17T13:58:40.2009706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:40.2010222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:40.2012582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:40.2013109Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:40.2355455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:40.2356229Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:40.2359708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:40.2360178Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:40.3716359Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0ws5n4mr 2022-08-17T13:58:40.3718695Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0ws5n4mr/_remote_module_non_scriptable.py 2022-08-17T13:58:40.4007228Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxnfs3oei 2022-08-17T13:58:40.4009903Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpxnfs3oei/_remote_module_non_scriptable.py 2022-08-17T13:58:41.9312036Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:41.9356550Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:41.9360483Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:58:41.9364509Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:58:41.9403099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:58:41.9406726Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:58:41.9407838Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:41.9467219Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:43.6146293Z ok (4.828s) 2022-08-17T13:58:43.6146546Z 2022-08-17T13:58:43.6146988Z ---------------------------------------------------------------------- 2022-08-17T13:58:43.6147337Z Ran 1 test in 4.829s 2022-08-17T13:58:43.6147508Z 2022-08-17T13:58:43.6147605Z OK 2022-08-17T13:58:43.6147723Z 2022-08-17T13:58:43.6147871Z Generating XML reports... 
2022-08-17T13:58:43.6192282Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135838.xml 2022-08-17T13:58:45.3940422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:45.3940914Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:45.3942353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:45.3942833Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:45.5668347Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptr910gvi 2022-08-17T13:58:45.5669658Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptr910gvi/_remote_module_non_scriptable.py 2022-08-17T13:58:47.0770015Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:47.0838896Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-08-17T13:58:47.0854663Z 2022-08-17T13:58:47.0855071Z Running tests... 2022-08-17T13:58:47.0855532Z ---------------------------------------------------------------------- 2022-08-17T13:58:47.1014754Z test_allreduce (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104050 2022-08-17T13:58:47.1020914Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104051 2022-08-17T13:58:48.5059637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:48.5060125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:48.5062950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:48.5063723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:48.5156751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:48.5157190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:48.5161396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:48.5161870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:48.6724612Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoxk9xpel 2022-08-17T13:58:48.6726846Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoxk9xpel/_remote_module_non_scriptable.py 2022-08-17T13:58:48.6882781Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptfeqadvd 2022-08-17T13:58:48.6885313Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptfeqadvd/_remote_module_non_scriptable.py 2022-08-17T13:58:50.2316207Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:50.2363707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:58:50.2367191Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-08-17T13:58:50.2536823Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:50.2583924Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:58:50.2587910Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:58:50.2588956Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:50.2674225Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:51.8141299Z ok (4.728s) 2022-08-17T13:58:51.8141545Z 2022-08-17T13:58:51.8141957Z ---------------------------------------------------------------------- 2022-08-17T13:58:51.8142310Z Ran 1 test in 4.729s 2022-08-17T13:58:51.8142478Z 2022-08-17T13:58:51.8142572Z OK 2022-08-17T13:58:51.8142709Z 2022-08-17T13:58:51.8142826Z Generating XML reports... 2022-08-17T13:58:51.8186844Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135847.xml 2022-08-17T13:58:53.5747400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:53.5747952Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:53.5750574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:53.5751081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:53.7452905Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpczes64y9 2022-08-17T13:58:53.7455421Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpczes64y9/_remote_module_non_scriptable.py 2022-08-17T13:58:55.2646100Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:55.2714865Z Test results 
will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-08-17T13:58:55.2730431Z 2022-08-17T13:58:55.2730853Z Running tests... 2022-08-17T13:58:55.2731335Z ---------------------------------------------------------------------- 2022-08-17T13:58:55.2891814Z test_broadcast (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104169 2022-08-17T13:58:55.2898010Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104170 2022-08-17T13:58:56.6661823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:56.6662371Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:56.6665219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:56.6665725Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:56.6869664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:58:56.6870143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:58:56.6874281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:58:56.6874762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:58:56.8338938Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqvbsrzgz 2022-08-17T13:58:56.8341489Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqvbsrzgz/_remote_module_non_scriptable.py 2022-08-17T13:58:56.8560635Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_8zwi1o9 2022-08-17T13:58:56.8563428Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmp_8zwi1o9/_remote_module_non_scriptable.py 2022-08-17T13:58:58.4043634Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:58.4091171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:58:58.4094847Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:58:58.4112452Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:58:58.4159468Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:58:58.4162898Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:58:58.4164116Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:58:58.4198190Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:00.1019485Z ok (4.828s) 2022-08-17T13:59:00.1019839Z 2022-08-17T13:59:00.1020474Z ---------------------------------------------------------------------- 2022-08-17T13:59:00.1021036Z Ran 1 test in 4.829s 2022-08-17T13:59:00.1021321Z 2022-08-17T13:59:00.1021485Z OK 2022-08-17T13:59:00.1022115Z 2022-08-17T13:59:00.1022567Z Generating XML reports... 
2022-08-17T13:59:00.1066913Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135855.xml 2022-08-17T13:59:01.8628054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:01.8628569Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:01.8630893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:01.8631395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:02.0359554Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2i5pdaq9 2022-08-17T13:59:02.0361395Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2i5pdaq9/_remote_module_non_scriptable.py 2022-08-17T13:59:03.5463589Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:59:03.5532125Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-08-17T13:59:03.5548200Z 2022-08-17T13:59:03.5548431Z Running tests... 2022-08-17T13:59:03.5548844Z ---------------------------------------------------------------------- 2022-08-17T13:59:03.5709087Z test_reduce (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104288 2022-08-17T13:59:03.5715404Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104289 2022-08-17T13:59:05.0066424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:05.0066956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:05.0069881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:05.0070367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:05.0197627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:05.0198097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:05.0201919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:05.0202674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:05.1795476Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp0inftml 2022-08-17T13:59:05.1797428Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp0inftml/_remote_module_non_scriptable.py 2022-08-17T13:59:05.1869086Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4gbs19sb 2022-08-17T13:59:05.1871781Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4gbs19sb/_remote_module_non_scriptable.py 2022-08-17T13:59:06.7532097Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:59:06.7580455Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:59:06.7584186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-08-17T13:59:06.7604955Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:59:06.7650199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:59:06.7653645Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:59:06.7654553Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:06.7687722Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:08.2837863Z ok (4.728s) 2022-08-17T13:59:08.2838078Z 2022-08-17T13:59:08.2838463Z ---------------------------------------------------------------------- 2022-08-17T13:59:08.2838802Z Ran 1 test in 4.729s 2022-08-17T13:59:08.2838952Z 2022-08-17T13:59:08.2839049Z OK 2022-08-17T13:59:08.2839192Z 2022-08-17T13:59:08.2839327Z Generating XML reports... 2022-08-17T13:59:08.2884070Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135903.xml 2022-08-17T13:59:10.0571625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:10.0572134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:10.0574548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:10.0575031Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:10.2297336Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprwba6o9z 2022-08-17T13:59:10.2300272Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprwba6o9z/_remote_module_non_scriptable.py 2022-08-17T13:59:11.7484134Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:59:11.7552587Z Test results 
will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-08-17T13:59:11.7569018Z 2022-08-17T13:59:11.7569249Z Running tests... 2022-08-17T13:59:11.7569681Z ---------------------------------------------------------------------- 2022-08-17T13:59:11.7741872Z test_reduce_scatter (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104407 2022-08-17T13:59:11.7748303Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104408 2022-08-17T13:59:13.1752209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:13.1752701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:13.1755382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:13.1755873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:13.1992930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:13.1993379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:13.1997678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:13.1998158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:13.3437760Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm44__6cz 2022-08-17T13:59:13.3439949Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm44__6cz/_remote_module_non_scriptable.py 2022-08-17T13:59:13.3713349Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjrmz47iu 2022-08-17T13:59:13.3716108Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpjrmz47iu/_remote_module_non_scriptable.py 2022-08-17T13:59:14.8907329Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:59:14.8955161Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:59:14.8958310Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:59:14.9153067Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:59:14.9200386Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:59:14.9204141Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:59:14.9205309Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:14.9265412Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:16.4865598Z ok (4.729s) 2022-08-17T13:59:16.4865982Z 2022-08-17T13:59:16.4866736Z ---------------------------------------------------------------------- 2022-08-17T13:59:16.4867193Z Ran 1 test in 4.730s 2022-08-17T13:59:16.4867363Z 2022-08-17T13:59:16.4867459Z OK 2022-08-17T13:59:16.4867598Z 2022-08-17T13:59:16.4868051Z Generating XML reports... 
2022-08-17T13:59:16.4911185Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135911.xml 2022-08-17T13:59:18.2471210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:18.2471757Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:18.2474315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:18.2474807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:18.4183185Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp70qub14q 2022-08-17T13:59:18.4184995Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp70qub14q/_remote_module_non_scriptable.py 2022-08-17T13:59:19.9297083Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:59:19.9367066Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-08-17T13:59:19.9382718Z 2022-08-17T13:59:19.9383241Z Running tests... 2022-08-17T13:59:19.9383727Z ---------------------------------------------------------------------- 2022-08-17T13:59:19.9550872Z test_reduce_scatter_non_contiguous (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104526 2022-08-17T13:59:19.9557382Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104527 2022-08-17T13:59:21.3448326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:21.3449331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:21.3451053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:21.3451566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:21.3654451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:21.3654984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:21.3659078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:21.3659582Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:21.5100619Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo7s3a8i9 2022-08-17T13:59:21.5102854Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo7s3a8i9/_remote_module_non_scriptable.py 2022-08-17T13:59:21.5359378Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1ytcaksa 2022-08-17T13:59:21.5362414Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1ytcaksa/_remote_module_non_scriptable.py 2022-08-17T13:59:23.0593793Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:59:23.0641857Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:59:23.0645699Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-08-17T13:59:23.0763672Z INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:59:23.0809365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:59:23.0812960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:59:23.0814169Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:23.0851021Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:24.6678928Z ok (4.729s) 2022-08-17T13:59:24.6679238Z 2022-08-17T13:59:24.6679914Z ---------------------------------------------------------------------- 2022-08-17T13:59:24.6680395Z Ran 1 test in 4.729s 2022-08-17T13:59:24.6680563Z 2022-08-17T13:59:24.6680660Z OK 2022-08-17T13:59:24.6680809Z 2022-08-17T13:59:24.6680932Z Generating XML reports... 2022-08-17T13:59:24.6726087Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135919.xml 2022-08-17T13:59:25.3617780Z Running distributed/fsdp/test_fsdp_freezing_weights ... [2022-08-17 13:59:25.361305] 2022-08-17T13:59:25.3618578Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_freezing_weights.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 13:59:25.361383] 2022-08-17T13:59:26.9424713Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_freezing_weights 2022-08-17T13:59:26.9440629Z 2022-08-17T13:59:26.9440796Z Running tests... 2022-08-17T13:59:26.9441235Z ---------------------------------------------------------------------- 2022-08-17T13:59:28.4095913Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T13:59:28.4273756Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104645 2022-08-17T13:59:28.4280838Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104646 2022-08-17T13:59:29.8482822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:29.8483319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:29.8485273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:29.8486039Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:29.8766280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:29.8766757Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:29.8770889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:29.8771357Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:30.0144120Z dist init r=1, world=2 2022-08-17T13:59:30.0148161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:59:30.0509282Z dist init r=0, world=2 2022-08-17T13:59:30.0513884Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:59:30.0515020Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:30.0556872Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:59:31.4316485Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:59:31.4317026Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:59:31.4617411Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:59:31.4617982Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:59:31.4618664Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:59:31.4619225Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:59:32.6726218Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:59:32.6726781Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:59:33.3404347Z ok (6.396s) 2022-08-17T13:59:33.3423524Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104728 2022-08-17T13:59:33.3429129Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104729 2022-08-17T13:59:34.8486919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:34.8487433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:34.8490137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:34.8490629Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:34.8672539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:34.8673032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:34.8676991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:34.8677479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:35.0212176Z dist init r=1, world=2 2022-08-17T13:59:35.0216357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:59:35.0392014Z dist init r=0, world=2 2022-08-17T13:59:35.0396696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:59:35.0397742Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:35.0421313Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:59:36.4219066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:59:36.4219644Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:59:36.4490871Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:59:36.4491451Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:59:36.4492153Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:59:36.4492696Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:59:37.6681336Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:59:37.6681884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:59:37.6784900Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:59:37.6785682Z warnings.warn( 2022-08-17T13:59:37.6786797Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:59:37.6787559Z warnings.warn( 2022-08-17T13:59:38.3559186Z ok (5.015s) 2022-08-17T13:59:38.3578137Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104811 2022-08-17T13:59:38.3584309Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104812 2022-08-17T13:59:39.8290761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:39.8291255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:39.8293171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:39.8293883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:39.8657494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:39.8657966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:39.8662221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:39.8662703Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:39.9946710Z dist init r=1, world=2 2022-08-17T13:59:39.9950669Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:59:40.0366669Z dist init r=0, world=2 2022-08-17T13:59:40.0371012Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-08-17T13:59:40.0371737Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:40.0461109Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:41.4059066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:59:41.4059590Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:59:41.4330334Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:59:41.4330927Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:59:41.4331886Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:59:41.4332434Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:59:42.5724163Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:59:42.5724699Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:59:43.1698136Z ok (4.814s) 2022-08-17T13:59:43.1717942Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104894 2022-08-17T13:59:43.1723708Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104895 2022-08-17T13:59:44.6179560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:44.6180478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:44.6182025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:44.6182831Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:44.6441521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:44.6442310Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:44.6446027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:44.6446807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:44.7828449Z dist init r=0, world=2 2022-08-17T13:59:44.7831971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:59:44.8166360Z dist init r=1, world=2 2022-08-17T13:59:44.8170817Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:59:44.8172093Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:44.8240443Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:59:46.1845992Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:59:46.1846530Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:59:46.2141518Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:59:46.2142379Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:59:46.2143687Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:59:46.2144520Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:59:47.3532667Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:59:47.3534936Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:59:47.3609149Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:59:47.3609985Z warnings.warn( 2022-08-17T13:59:47.3612002Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:59:47.3612828Z warnings.warn( 2022-08-17T13:59:47.9840969Z ok (4.814s) 2022-08-17T13:59:47.9861326Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 104977 2022-08-17T13:59:47.9868052Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 104978 2022-08-17T13:59:49.4159234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:49.4159733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:49.4162856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:49.4163348Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:49.4436652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:49.4437100Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:49.4441419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:49.4441900Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:49.5812681Z dist init r=1, world=2 2022-08-17T13:59:49.5816906Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:59:49.6154506Z dist init r=0, world=2 2022-08-17T13:59:49.6159331Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-08-17T13:59:49.6160289Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:49.6225266Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:51.0005549Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:59:51.0006072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:59:51.0297416Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:59:51.0297973Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:59:51.0299025Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:59:51.0299628Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:59:52.2309718Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:59:52.2310263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:59:52.8987454Z ok (4.915s) 2022-08-17T13:59:52.9006816Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105060 2022-08-17T13:59:52.9012414Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105061 2022-08-17T13:59:54.2908266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:54.2908777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:54.2911605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:54.2912090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:54.3569044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:54.3569509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:54.3573589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:54.3574066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:54.4600549Z dist init r=0, world=2 2022-08-17T13:59:54.4604358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:59:54.5293525Z dist init r=1, world=2 2022-08-17T13:59:54.5297915Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T13:59:54.5299326Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:54.5317133Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T13:59:55.9072716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T13:59:55.9073252Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T13:59:55.9380464Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:59:55.9381051Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:59:55.9382056Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T13:59:55.9382635Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T13:59:57.1671392Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:59:57.1671947Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T13:59:57.1949199Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:59:57.1950289Z warnings.warn( 2022-08-17T13:59:57.1951428Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T13:59:57.1952171Z warnings.warn( 2022-08-17T13:59:57.9137460Z ok (5.015s) 2022-08-17T13:59:57.9157987Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105173 2022-08-17T13:59:57.9163704Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105174 2022-08-17T13:59:59.3426289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:59.3426852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:59.3429982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:59.3430472Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:59.3715694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T13:59:59.3716169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T13:59:59.3720414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T13:59:59.3720887Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T13:59:59.5097263Z dist init r=0, world=2 2022-08-17T13:59:59.5101011Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T13:59:59.5441435Z dist init r=1, world=2 2022-08-17T13:59:59.5446122Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-08-17T13:59:59.5446903Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T13:59:59.5509352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:00.9074828Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:00.9075358Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:00.9344476Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:00.9345046Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:00.9346006Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:00.9346581Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:02.0682639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T14:00:02.0683244Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T14:00:02.7280743Z ok (4.814s) 2022-08-17T14:00:02.7300935Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105256 2022-08-17T14:00:02.7307207Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105257 2022-08-17T14:00:04.1594373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:04.1596847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:04.1597448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:04.1598140Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:04.1919936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:04.1920384Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:04.1924815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:04.1925304Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:04.3251853Z dist init r=0, world=2 2022-08-17T14:00:04.3255360Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:00:04.3648455Z dist init r=1, world=2 2022-08-17T14:00:04.3653463Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:00:04.3654613Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:04.3663593Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:00:05.7415416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:05.7416051Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:05.7666088Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:05.7666770Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:05.7667489Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:05.7668041Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:06.9139028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T14:00:06.9142746Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T14:00:06.9250012Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:00:06.9251191Z warnings.warn( 2022-08-17T14:00:06.9252351Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:00:06.9253103Z warnings.warn( 2022-08-17T14:00:07.6426590Z ok (4.915s) 2022-08-17T14:00:07.6426834Z 2022-08-17T14:00:07.6427390Z ---------------------------------------------------------------------- 2022-08-17T14:00:07.6427769Z Ran 8 tests in 40.699s 2022-08-17T14:00:07.6428270Z 2022-08-17T14:00:07.6428365Z OK 2022-08-17T14:00:07.6428505Z 2022-08-17T14:00:07.6428644Z Generating XML reports... 2022-08-17T14:00:07.6471447Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_freezing_weights/TEST-TestFreezingWeights-20220817135926.xml 2022-08-17T14:00:08.0055658Z Running distributed/fsdp/test_fsdp_comm ... [2022-08-17 14:00:08.005019] 2022-08-17T14:00:08.0057068Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:00:08.005096] 2022-08-17T14:00:09.6449712Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_comm 2022-08-17T14:00:09.6467455Z 2022-08-17T14:00:09.6467734Z Running tests... 2022-08-17T14:00:09.6468308Z ---------------------------------------------------------------------- 2022-08-17T14:00:09.6488361Z test_communication_nested_model_False_use_no_sync_False_sharding_strategy_None (__main__.TestCommunication) 2022-08-17T14:00:11.1777206Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:00:11.1962068Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105404 2022-08-17T14:00:11.1968548Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105405 2022-08-17T14:00:12.6102807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:12.6103570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:12.6105771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:12.6106259Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:12.6422554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:12.6423018Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:12.6427358Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:12.6427852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:12.7766490Z dist init r=0, world=2 2022-08-17T14:00:12.7770340Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:00:12.8173201Z dist init r=1, world=2 2022-08-17T14:00:12.8178441Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:00:12.8179145Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:12.8179849Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:00:14.2002653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:14.2003198Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:14.2461275Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:14.2461864Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:14.2462922Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:14.2463456Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:15.4075548Z ok (5.760s) 2022-08-17T14:00:15.4093289Z test_communication_nested_model_False_use_no_sync_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication) 2022-08-17T14:00:15.4107710Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105487 2022-08-17T14:00:15.4113487Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105488 2022-08-17T14:00:16.8663539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:16.8664218Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:16.8666659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:16.8667157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:16.8772125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:16.8772575Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:16.8777169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:16.8777649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:17.0315597Z dist init r=1, world=2 2022-08-17T14:00:17.0319468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:00:17.0497375Z dist init r=0, world=2 2022-08-17T14:00:17.0502742Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:00:17.0504132Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:17.0524404Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:00:18.4436380Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:18.4460696Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:18.4701468Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:18.4702028Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:18.4898921Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:18.4899485Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:19.7221048Z ok (4.314s) 2022-08-17T14:00:19.7238593Z test_communication_nested_model_False_use_no_sync_True_sharding_strategy_None (__main__.TestCommunication) 2022-08-17T14:00:19.7251842Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105570 2022-08-17T14:00:19.7257814Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105571 2022-08-17T14:00:21.1822820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:21.1823612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:21.1825619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:21.1826111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:21.1879272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:21.1879735Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:21.1883726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:21.1884394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:21.3543730Z dist init r=1, world=2 2022-08-17T14:00:21.3544350Z dist init r=0, world=2 2022-08-17T14:00:21.3548357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:00:21.3549338Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:00:21.3550068Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:21.3550772Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:00:22.7392210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:22.7392743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:22.7961631Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:22.7962254Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:22.8056436Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:22.8057039Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:24.1369223Z ok (4.415s) 2022-08-17T14:00:24.1387358Z test_communication_nested_model_False_use_no_sync_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication) 2022-08-17T14:00:24.1400972Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105653 2022-08-17T14:00:24.1407027Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105654 2022-08-17T14:00:25.5673229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:25.5673766Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:25.5675807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:25.5676295Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:25.5971007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:25.5971470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:25.5975747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:25.5976225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:25.7324363Z dist init r=0, world=2 2022-08-17T14:00:25.7328551Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:00:25.7683860Z dist init r=1, world=2 2022-08-17T14:00:25.7689261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:00:25.7690029Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:25.7737060Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:00:27.1588371Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:27.1588910Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:27.2017901Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:27.2018855Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:27.2028609Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:27.2029168Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:28.4515879Z ok (4.315s) 2022-08-17T14:00:28.4533053Z test_communication_nested_model_True_use_no_sync_False_sharding_strategy_None (__main__.TestCommunication) 2022-08-17T14:00:28.4546747Z Tests FSDP's communication cost in terms of calls to collective ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105736 2022-08-17T14:00:28.4552445Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105737 2022-08-17T14:00:29.8711002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:29.8711512Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:29.8713723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:29.8714212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:29.9091610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:29.9092050Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:29.9096347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:29.9096828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:30.0352982Z dist init r=0, world=2 2022-08-17T14:00:30.0357049Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:00:30.0809059Z dist init r=1, world=2 2022-08-17T14:00:30.0814320Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:00:30.0815080Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:30.0867381Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:00:31.4598098Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:31.4598618Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:31.4805978Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:00:31.4807176Z warnings.warn( 2022-08-17T14:00:31.4809507Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T14:00:31.4810266Z warnings.warn( 2022-08-17T14:00:31.4868435Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:31.4869575Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:31.4870299Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:31.4870850Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:32.3650495Z ok (3.913s) 2022-08-17T14:00:32.3669768Z test_communication_nested_model_True_use_no_sync_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication) 2022-08-17T14:00:32.3682906Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105819 2022-08-17T14:00:32.3688495Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105820 2022-08-17T14:00:33.8131283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:33.8131864Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:33.8134146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:33.8134641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:33.8930566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:33.8931055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:33.8935093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 
disabled tests 2022-08-17T14:00:33.8935561Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:33.9779183Z dist init r=1, world=2 2022-08-17T14:00:33.9783426Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:00:34.0643211Z dist init r=0, world=2 2022-08-17T14:00:34.0648034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:00:34.0648963Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:34.0700556Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:35.4312747Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:35.4313277Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:35.4528351Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:00:35.4529248Z warnings.warn( 2022-08-17T14:00:35.4530654Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T14:00:35.4531432Z warnings.warn( 2022-08-17T14:00:35.4589653Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:35.4590215Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:35.4591045Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:35.4591582Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:36.3788081Z ok (4.014s) 2022-08-17T14:00:36.3807067Z test_communication_nested_model_True_use_no_sync_True_sharding_strategy_None (__main__.TestCommunication) 2022-08-17T14:00:36.3820440Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105902 2022-08-17T14:00:36.3826949Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105903 2022-08-17T14:00:37.8394630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:37.8395143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:37.8396806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:37.8397294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:37.8435573Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:37.8436030Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:37.8440541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T14:00:37.8441033Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:38.0059038Z dist init r=1, world=2 2022-08-17T14:00:38.0062317Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:00:38.0173965Z dist init r=0, world=2 2022-08-17T14:00:38.0179091Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:00:38.0180111Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:38.0267484Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:39.3851153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:39.3851680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:39.4048441Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:00:39.4049250Z warnings.warn( 2022-08-17T14:00:39.4050642Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T14:00:39.4051413Z warnings.warn( 2022-08-17T14:00:39.4111035Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:39.4111638Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:39.4112336Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:39.4113014Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:40.2924845Z ok (3.914s) 2022-08-17T14:00:40.2942673Z test_communication_nested_model_True_use_no_sync_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication) 2022-08-17T14:00:40.2956670Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 105985 2022-08-17T14:00:40.2962462Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 105986 2022-08-17T14:00:41.7435628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:41.7436140Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:41.7437863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:41.7438370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:41.7656863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:41.7657329Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:41.7661780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 
disabled tests 2022-08-17T14:00:41.7662259Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:41.9084272Z dist init r=1, world=2 2022-08-17T14:00:41.9088288Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:00:41.9384919Z dist init r=0, world=2 2022-08-17T14:00:41.9390320Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:00:41.9391424Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:41.9394799Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:43.3085616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:43.3086150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:43.3295923Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:00:43.3296715Z warnings.warn( 2022-08-17T14:00:43.3322941Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T14:00:43.3323728Z warnings.warn( 2022-08-17T14:00:43.3357677Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:43.3358277Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:43.3386587Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:43.3387130Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:44.2059806Z ok (3.913s) 2022-08-17T14:00:44.2060013Z 2022-08-17T14:00:44.2060409Z ---------------------------------------------------------------------- 2022-08-17T14:00:44.2060776Z Ran 8 tests in 34.559s 2022-08-17T14:00:44.2060949Z 2022-08-17T14:00:44.2061047Z OK 2022-08-17T14:00:44.2061183Z 2022-08-17T14:00:44.2061326Z Generating XML reports... 2022-08-17T14:00:44.2109824Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_comm/TEST-TestCommunication-20220817140009.xml 2022-08-17T14:00:44.5648700Z Running distributed/fsdp/test_fsdp_exec_order ... [2022-08-17 14:00:44.564403] 2022-08-17T14:00:44.5649467Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_exec_order.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:00:44.564477] 2022-08-17T14:00:46.2022910Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_exec_order 2022-08-17T14:00:46.2039405Z 2022-08-17T14:00:46.2039655Z Running tests... 
2022-08-17T14:00:46.2040112Z ---------------------------------------------------------------------- 2022-08-17T14:00:46.2048904Z test_invalid_first_iter_order_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestFSDPExecOrder) 2022-08-17T14:00:47.7329363Z Tests that FSDP errors if the all-gather order differs across ranks ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:00:47.7508122Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106103 2022-08-17T14:00:47.7514243Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106104 2022-08-17T14:00:49.2080529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:49.2081562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:49.2083109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:49.2084064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:49.2133302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:49.2134180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:49.2137566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:49.2138518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:49.3748959Z dist init r=0, world=2 2022-08-17T14:00:49.3753520Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:00:49.3817631Z dist init r=1, world=2 2022-08-17T14:00:49.3821970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:00:49.3822882Z INFO:torch.distributed.distributed_c10d:Rank 1: 
Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:49.3856889Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:50.7481992Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:50.7482970Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:50.7732394Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:00:50.7733973Z warnings.warn( 2022-08-17T14:00:50.7736552Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T14:00:50.7738043Z warnings.warn( 2022-08-17T14:00:50.7774861Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:50.7775936Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:50.7777248Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:50.7778246Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:51.6627386Z ok (5.458s) 2022-08-17T14:00:51.6633012Z test_invalid_first_iter_order_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestFSDPExecOrder) 2022-08-17T14:00:51.6646613Z Tests that FSDP errors if the all-gather order differs across ranks ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106182 2022-08-17T14:00:51.6652410Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106183 2022-08-17T14:00:53.1349480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:53.1350018Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:53.1352586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:53.1353078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:53.1437940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:53.1438431Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:53.1442202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T14:00:53.1442686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:53.3073409Z dist init r=1, world=2 2022-08-17T14:00:53.3077448Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:00:53.3100600Z dist init r=0, world=2 2022-08-17T14:00:53.3105543Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:00:53.3106435Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:53.3181137Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:54.6816136Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:54.6816704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:54.7009483Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:00:54.7010283Z warnings.warn( 2022-08-17T14:00:54.7011417Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T14:00:54.7012324Z warnings.warn( 2022-08-17T14:00:54.7051880Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:54.7053055Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:54.7053786Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:54.7054330Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:55.5749597Z ok (3.912s) 2022-08-17T14:00:55.5762473Z test_invalid_later_iter_order_sharding_strategy_ShardingStrategy_FULL_SHARD_iters_before_path_change_1 (__main__.TestFSDPExecOrder) 2022-08-17T14:00:55.5775606Z Tests that FSDP warns the user if the all-gather order changes after ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106261 2022-08-17T14:00:55.5781403Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106262 2022-08-17T14:00:57.0021087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:57.0021623Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:57.0024074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:00:57.0271387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:57.0271997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:00:57.0272458Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:00:57.0276229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 
disabled tests 2022-08-17T14:00:57.0276714Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:00:57.1681201Z dist init r=0, world=2 2022-08-17T14:00:57.1685201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:00:57.1995834Z dist init r=1, world=2 2022-08-17T14:00:57.2000249Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:00:57.2001037Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:57.2094052Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:00:58.5665167Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:00:58.5665949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:00:58.5931263Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:00:58.5932055Z warnings.warn( 2022-08-17T14:00:58.5933175Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T14:00:58.5934143Z warnings.warn( 2022-08-17T14:00:58.5970481Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:58.5971162Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:58.5972832Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:58.5973380Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:59.0195924Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:00:59.0196514Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:00:59.4873558Z ok (3.912s) 2022-08-17T14:00:59.4886556Z test_invalid_later_iter_order_sharding_strategy_ShardingStrategy_FULL_SHARD_iters_before_path_change_3 (__main__.TestFSDPExecOrder) 2022-08-17T14:00:59.4900411Z Tests that FSDP warns the user if the all-gather order changes after ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106344 2022-08-17T14:00:59.4906812Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106345 2022-08-17T14:01:00.9972125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:00.9973095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:00.9974702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:00.9975646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:00.9984635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:00.9985571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:00.9989547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:00.9990520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:01.1720312Z dist init r=0, world=2 2022-08-17T14:01:01.1724544Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:01:01.1744561Z dist init r=1, world=2 2022-08-17T14:01:01.1749407Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:01:01.1750368Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:01:01.1828097Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:01:02.5656725Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:01:02.5657707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:01:02.5851253Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:01:02.5852421Z warnings.warn( 2022-08-17T14:01:02.5886465Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T14:01:02.5887949Z warnings.warn( 2022-08-17T14:01:02.5893365Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:02.5894468Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:02.5928634Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:02.5929774Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:03.0400871Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:03.0402019Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:03.5003897Z ok (4.013s) 2022-08-17T14:01:03.5016492Z test_invalid_later_iter_order_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP_iters_before_path_change_1 (__main__.TestFSDPExecOrder) 2022-08-17T14:01:03.5029655Z Tests that FSDP warns the user if the all-gather order changes after ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106427 2022-08-17T14:01:03.5035631Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106428 2022-08-17T14:01:04.9803945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:04.9804446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:04.9806996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:04.9807488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:05.0143271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:05.0143717Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:05.0148045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:05.0148530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:05.1481826Z dist init r=0, world=2 2022-08-17T14:01:05.1485903Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:01:05.1879425Z dist init r=1, world=2 2022-08-17T14:01:05.1883978Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:01:05.1885184Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:01:05.1893894Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:01:06.5584722Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:01:06.5585268Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:01:06.5850195Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:01:06.5851326Z warnings.warn( 2022-08-17T14:01:06.5852464Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T14:01:06.5853207Z warnings.warn( 2022-08-17T14:01:06.5889830Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:06.5891017Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:06.5891786Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:06.5892334Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:07.0295981Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:07.0296592Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:07.5136134Z ok (4.013s) 2022-08-17T14:01:07.5149898Z test_invalid_later_iter_order_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP_iters_before_path_change_3 (__main__.TestFSDPExecOrder) 2022-08-17T14:01:07.5162747Z Tests that FSDP warns the user if the all-gather order changes after ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106510 2022-08-17T14:01:07.5168419Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106511 2022-08-17T14:01:08.9119130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:08.9119625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:08.9122147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:08.9122642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:08.9862349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:08.9862812Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:08.9866988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:08.9867469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:09.0810093Z dist init r=0, world=2 2022-08-17T14:01:09.0813496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:01:09.1609948Z dist init r=1, world=2 2022-08-17T14:01:09.1614642Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:01:09.1615451Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:01:09.1627453Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:01:10.5190349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:01:10.5190903Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:01:10.5407738Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:01:10.5408845Z warnings.warn( 2022-08-17T14:01:10.5409962Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T14:01:10.5410698Z warnings.warn( 2022-08-17T14:01:10.5447323Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:10.5447886Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:10.5448579Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:10.5449109Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:10.9723233Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:10.9723825Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:11.4267981Z ok (3.913s) 2022-08-17T14:01:11.4291822Z test_train_eval_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestFSDPExecOrder) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106593 2022-08-17T14:01:11.4297895Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106594 2022-08-17T14:01:12.8843984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:12.8844492Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:12.8846730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:12.8847230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:12.8950587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:12.8951054Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:12.8955163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:12.8955659Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:13.0521051Z dist init r=0, world=2 2022-08-17T14:01:13.0524844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:01:13.0671859Z dist init r=1, world=2 2022-08-17T14:01:13.0676608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:01:13.0677431Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:01:13.0730119Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:01:14.4570506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:01:14.4571046Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:01:14.4770488Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:01:14.4771265Z warnings.warn( 2022-08-17T14:01:14.4772381Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:01:14.4773114Z warnings.warn( 2022-08-17T14:01:15.4398875Z ok (4.013s) 2022-08-17T14:01:15.4422361Z test_train_eval_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestFSDPExecOrder) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106676 2022-08-17T14:01:15.4428702Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106677 2022-08-17T14:01:16.9048585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:16.9049082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:16.9051760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:16.9052253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:16.9138905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:16.9139393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:16.9143529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:16.9144017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:17.0767847Z dist init r=1, world=2 2022-08-17T14:01:17.0771879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:01:17.0794253Z dist init r=0, world=2 2022-08-17T14:01:17.0798339Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:01:17.0799114Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:01:17.0875649Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:01:18.4486538Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:01:18.4487085Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:01:18.4688188Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:01:18.4689000Z warnings.warn( 2022-08-17T14:01:18.4724401Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:01:18.4725319Z warnings.warn( 2022-08-17T14:01:19.3527294Z ok (3.913s) 2022-08-17T14:01:19.3527537Z 2022-08-17T14:01:19.3527943Z ---------------------------------------------------------------------- 2022-08-17T14:01:19.3528282Z Ran 8 tests in 33.149s 2022-08-17T14:01:19.3528449Z 2022-08-17T14:01:19.3528528Z OK 2022-08-17T14:01:19.3533039Z 2022-08-17T14:01:19.3533348Z Generating XML reports... 2022-08-17T14:01:19.3574339Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_exec_order/TEST-TestFSDPExecOrder-20220817140046.xml 2022-08-17T14:01:19.7028830Z Running distributed/algorithms/ddp_comm_hooks/test_ddp_hooks ... 
[2022-08-17 14:01:19.702435] 2022-08-17T14:01:19.7029627Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/algorithms/ddp_comm_hooks/test_ddp_hooks.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:01:19.702502] 2022-08-17T14:01:21.2997688Z Test results will be stored in test-reports/python-unittest/distributed.algorithms.ddp_comm_hooks.test_ddp_hooks 2022-08-17T14:01:21.3014169Z 2022-08-17T14:01:21.3014597Z Running tests... 2022-08-17T14:01:21.3015089Z ---------------------------------------------------------------------- 2022-08-17T14:01:21.3022984Z test_ddp_comm_hook_allreduce_hook (__main__.DistributedDataParallelCommHookTest) 2022-08-17T14:01:22.8149532Z This unit test verifies the ``allreduce`` hook registered case gives same result ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:01:22.8339240Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106794 2022-08-17T14:01:22.8345584Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106795 2022-08-17T14:01:24.2470753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:24.2471284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:24.2471921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:24.2472649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:24.2528146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:24.2528846Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:24.2531451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:24.2532190Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:24.4096905Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:01:24.4236051Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:01:25.6961328Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu0i0uofm 2022-08-17T14:01:25.6962377Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu0i0uofm/_remote_module_non_scriptable.py 2022-08-17T14:01:25.7197528Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprqsshyk6 2022-08-17T14:01:25.7200269Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprqsshyk6/_remote_module_non_scriptable.py 2022-08-17T14:01:26.7994303Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:26.7994921Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:26.7995613Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:26.7996499Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:26.8056374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T14:01:26.8057109Z warnings.warn(msg, FutureWarning) 2022-08-17T14:01:26.8058032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. 
Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T14:01:26.8058695Z warnings.warn(msg, FutureWarning) 2022-08-17T14:01:27.2458722Z ok (5.944s) 2022-08-17T14:01:27.2464620Z test_ddp_comm_hook_fp16compress_hook (__main__.DistributedDataParallelCommHookTest) 2022-08-17T14:01:27.2478711Z This unit test verifies the ``fp16 compress`` hook registered case ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106878 2022-08-17T14:01:27.2485385Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106879 2022-08-17T14:01:28.6461947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:28.6462451Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:28.6463686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:28.6464194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:28.6810003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:28.6810466Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:28.6813597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:28.6814089Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:28.8122587Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:01:28.8440293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:01:30.0730494Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp17xm1jbk 2022-08-17T14:01:30.0731333Z 
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp17xm1jbk/_remote_module_non_scriptable.py 2022-08-17T14:01:30.0937008Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzpn4d09n 2022-08-17T14:01:30.0939668Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzpn4d09n/_remote_module_non_scriptable.py 2022-08-17T14:01:31.1366666Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:31.1367578Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:31.1368372Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:31.1368961Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:31.1431712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T14:01:31.1432412Z warnings.warn(msg, FutureWarning) 2022-08-17T14:01:31.1433353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 
2022-08-17T14:01:31.1434164Z warnings.warn(msg, FutureWarning) 2022-08-17T14:01:31.6595277Z ok (4.414s) 2022-08-17T14:01:31.6601945Z test_ddp_comm_hook_noop_hook (__main__.DistributedDataParallelCommHookTest) 2022-08-17T14:01:31.6615801Z This unit test verifies the ``noop`` hook registered case and a subsequent allreduce ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 106962 2022-08-17T14:01:31.6622293Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 106963 2022-08-17T14:01:33.1067149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:33.1068169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:33.1068798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:33.1069285Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:33.1191789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:33.1192241Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:33.1195653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:33.1196136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:33.2700065Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:01:33.2906822Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:01:34.5414478Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjajkq3e7 2022-08-17T14:01:34.5415065Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjajkq3e7/_remote_module_non_scriptable.py 
2022-08-17T14:01:34.5857464Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6iao7wn_ 2022-08-17T14:01:34.5858749Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6iao7wn_/_remote_module_non_scriptable.py 2022-08-17T14:01:35.5996743Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:35.5997351Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:35.5998050Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:35.5998599Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:35.6060995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T14:01:35.6061803Z warnings.warn(msg, FutureWarning) 2022-08-17T14:01:35.6062740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T14:01:35.6063874Z warnings.warn(msg, FutureWarning) 2022-08-17T14:01:36.0730272Z ok (4.413s) 2022-08-17T14:01:36.0736497Z test_ddp_comm_hook_quantize_per_channel_hook (__main__.DistributedDataParallelCommHookTest) 2022-08-17T14:01:36.0751319Z This unit test verifies the ``quantize per channel`` hook registered case ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107046 2022-08-17T14:01:36.0757768Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107047 2022-08-17T14:01:37.5273459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:37.5273995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:37.5278297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:37.5278806Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:37.5447375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:37.5447851Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:37.5450341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:37.5450807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:37.6907438Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:01:37.7141227Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:01:38.9495006Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphfsmnp80 2022-08-17T14:01:38.9495608Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphfsmnp80/_remote_module_non_scriptable.py 2022-08-17T14:01:39.0085651Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6yyjr739 2022-08-17T14:01:39.0087024Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6yyjr739/_remote_module_non_scriptable.py 2022-08-17T14:01:40.0053350Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:40.0053970Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:40.0054673Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:40.0055197Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:40.0151172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T14:01:40.0151870Z warnings.warn(msg, FutureWarning) 2022-08-17T14:01:40.0153066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T14:01:40.0153810Z warnings.warn(msg, FutureWarning) 2022-08-17T14:01:40.4866363Z ok (4.413s) 2022-08-17T14:01:40.4871784Z test_ddp_comm_hook_quantize_per_tensor_hook (__main__.DistributedDataParallelCommHookTest) 2022-08-17T14:01:40.4885040Z This unit test verifies the ``quantize per tensor`` hook registered case ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107130 2022-08-17T14:01:40.4891399Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107131 2022-08-17T14:01:41.9378788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:41.9379293Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:41.9380383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:41.9380858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:41.9886096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:41.9886558Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:41.9889534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:41.9889999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:42.1078952Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:01:42.1569292Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:01:43.4095545Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqe56dgp2 2022-08-17T14:01:43.4096153Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqe56dgp2/_remote_module_non_scriptable.py 2022-08-17T14:01:43.4206257Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdamqz7hw 2022-08-17T14:01:43.4208997Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdamqz7hw/_remote_module_non_scriptable.py 2022-08-17T14:01:44.5002041Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:44.5002645Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:44.5003345Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:44.5003866Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:44.5095347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T14:01:44.5096036Z warnings.warn(msg, FutureWarning) 2022-08-17T14:01:44.5096957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-08-17T14:01:44.5097616Z warnings.warn(msg, FutureWarning) 2022-08-17T14:01:44.9999303Z ok (4.513s) 2022-08-17T14:01:45.0023865Z test_is_last_hook (__main__.DistributedDataParallelCommHookTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107214 2022-08-17T14:01:45.0030784Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107215 2022-08-17T14:01:46.4125441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:46.4126207Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:46.4127385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:46.4127870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:46.4132202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:46.4132660Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:46.4135849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:46.4136506Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:46.5814434Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:01:46.5824545Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:01:49.4443877Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbdzzh1w0 2022-08-17T14:01:49.4444475Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbdzzh1w0/_remote_module_non_scriptable.py 2022-08-17T14:01:49.4467527Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiz266ozu 2022-08-17T14:01:49.4470445Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiz266ozu/_remote_module_non_scriptable.py 2022-08-17T14:01:50.5113475Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:50.5114103Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:50.5114826Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:01:50.5115373Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:01:51.5186383Z ok (6.519s) 2022-08-17T14:01:51.5186718Z 2022-08-17T14:01:51.5187445Z ---------------------------------------------------------------------- 2022-08-17T14:01:51.5187955Z Ran 6 tests in 30.217s 2022-08-17T14:01:51.5188123Z 2022-08-17T14:01:51.5188198Z OK 2022-08-17T14:01:51.5190011Z 2022-08-17T14:01:51.5190333Z Generating XML reports... 2022-08-17T14:01:51.5229614Z Generated XML report: test-reports/python-unittest/distributed.algorithms.ddp_comm_hooks.test_ddp_hooks/TEST-DistributedDataParallelCommHookTest-20220817140121.xml 2022-08-17T14:01:51.8682138Z Running distributed/fsdp/test_fsdp_meta ... [2022-08-17 14:01:51.867782] 2022-08-17T14:01:51.8682861Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_meta.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:01:51.867854] 2022-08-17T14:01:53.4571261Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_meta 2022-08-17T14:01:53.4589248Z 2022-08-17T14:01:53.4589678Z Running tests... 2022-08-17T14:01:53.4590177Z ---------------------------------------------------------------------- 2022-08-17T14:01:54.9279033Z test_bad_arg_meta (__main__.TestFSDPWithMetaDevice) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:01:54.9455214Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107333 2022-08-17T14:01:54.9461424Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107334 2022-08-17T14:01:56.3906994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:56.3907512Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:56.3909094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:56.3909604Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:56.4358328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:56.4358772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:56.4361415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:56.4361899Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:56.5544291Z dist init r=0, world=2 2022-08-17T14:01:56.5548341Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:01:56.6067938Z dist init r=1, world=2 2022-08-17T14:01:56.6072527Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:01:56.6075855Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:01:56.6161814Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:01:57.9725302Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:01:57.9725827Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:01:58.4549090Z ok (4.996s) 2022-08-17T14:01:58.4553348Z test_bad_arg_torchdistx (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.000s) 2022-08-17T14:01:58.4571053Z test_nested_model_with_meta_device_default_init_auto_wrap_False (__main__.TestFSDPWithMetaDevice) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107412 2022-08-17T14:01:58.4577258Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107413 2022-08-17T14:01:59.9083390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:59.9083904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:59.9084681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:59.9085155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:01:59.9711473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:01:59.9711935Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:01:59.9714405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:01:59.9714885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:00.0735289Z dist init r=1, world=2 2022-08-17T14:02:00.0739432Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:00.1356017Z dist init r=0, world=2 
2022-08-17T14:02:00.1360084Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:00.1363314Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:00.1454536Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:01.4852259Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:01.4852810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:01.5062237Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:01.5062866Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:01.5064389Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:01.5064984Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:02.3674036Z ok (3.912s) 2022-08-17T14:02:02.3691566Z test_nested_model_with_meta_device_default_init_auto_wrap_True (__main__.TestFSDPWithMetaDevice) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107495 2022-08-17T14:02:02.3698012Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107496 2022-08-17T14:02:03.8773548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:03.8774074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:03.8775447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:03.8775930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:03.8841141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:03.8841606Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:03.8844841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:03.8845317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:04.0489056Z dist init r=1, world=2 2022-08-17T14:02:04.0493195Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:04.0549094Z dist init r=0, world=2 2022-08-17T14:02:04.0553780Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:04.0557231Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:04.0598836Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:02:05.4322824Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:05.4323331Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:05.4592100Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:05.4592681Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:05.4593390Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:05.4593915Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:06.3794984Z ok (4.012s) 2022-08-17T14:02:06.3812228Z test_nested_model_with_meta_device_reset_params_auto_wrap_False (__main__.TestFSDPWithMetaDevice) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107578 2022-08-17T14:02:06.3818254Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107579 2022-08-17T14:02:07.8221012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:07.8221533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:07.8222329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:07.8223013Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:07.8635570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:07.8636033Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:07.8639256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:07.8639716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:07.9868962Z dist init r=1, world=2 2022-08-17T14:02:07.9872171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:08.0342232Z dist init r=0, world=2 2022-08-17T14:02:08.0346892Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:08.0350192Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:08.0385587Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:02:09.3933847Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:09.3934370Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:09.4142500Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:09.4143048Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:09.4151772Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:09.4152356Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:10.2916380Z ok (3.912s) 2022-08-17T14:02:10.2933300Z test_nested_model_with_meta_device_reset_params_auto_wrap_True (__main__.TestFSDPWithMetaDevice) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107661 2022-08-17T14:02:10.2938878Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107662 2022-08-17T14:02:11.7310698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:11.7311177Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:11.7312242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:11.7312744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:11.7609952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:11.7610410Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:11.7613538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:11.7614015Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:11.8974082Z dist init r=1, world=2 2022-08-17T14:02:11.8977976Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:11.9345987Z dist init r=0, world=2 2022-08-17T14:02:11.9351110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:11.9354407Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:11.9388563Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:02:13.3083422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:13.3083984Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:13.3339110Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:13.3339668Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:13.3347501Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:13.3348058Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:14.2036456Z ok (3.912s) 2022-08-17T14:02:14.2041233Z test_nested_model_with_torchdistX_default_init_auto_wrap_False (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.001s) 2022-08-17T14:02:14.2046236Z test_nested_model_with_torchdistX_default_init_auto_wrap_True (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.000s) 2022-08-17T14:02:14.2050974Z test_nested_model_with_torchdistX_init_fn_auto_wrap_False (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.000s) 2022-08-17T14:02:14.2056354Z test_nested_model_with_torchdistX_init_fn_auto_wrap_True (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.000s) 2022-08-17T14:02:14.2073697Z test_simple_model_with_meta_device_default_init (__main__.TestFSDPWithMetaDevice) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107744 2022-08-17T14:02:14.2079756Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107745 2022-08-17T14:02:15.6565607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:15.6566561Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:15.6567751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:15.6568674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:15.6682818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:15.6683720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:15.6685902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:15.6686827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:15.8247375Z dist init r=0, world=2 2022-08-17T14:02:15.8251720Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:15.8379383Z dist init r=1, world=2 2022-08-17T14:02:15.8384030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:15.8387581Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:15.8458850Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:02:17.2191833Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:17.2192848Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:17.2342185Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:17.2343929Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:17.2345611Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:17.2346646Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:18.1176437Z ok (3.912s) 2022-08-17T14:02:18.1193767Z test_simple_model_with_meta_device_reset_params (__main__.TestFSDPWithMetaDevice) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107827 2022-08-17T14:02:18.1199864Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107828 2022-08-17T14:02:19.5447448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:19.5447986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:19.5449126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:19.5449610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:19.5902032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:19.5902505Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:19.5905671Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:19.5906153Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:19.7105423Z dist init r=1, world=2 2022-08-17T14:02:19.7109490Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:19.7627152Z dist init r=0, world=2 2022-08-17T14:02:19.7632126Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:19.7635600Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:19.7723585Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:21.1522733Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:21.1523300Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:21.1670763Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:21.1671347Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:21.1675968Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:21.1676543Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:22.0297293Z ok (3.912s) 2022-08-17T14:02:22.0302079Z test_simple_model_with_torchdistX_default_init (__main__.TestFSDPWithMetaDevice) ... 
skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.001s) 2022-08-17T14:02:22.0307701Z test_simple_model_with_torchdistX_init_fn (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.000s) 2022-08-17T14:02:22.0308274Z 2022-08-17T14:02:22.0308672Z ---------------------------------------------------------------------- 2022-08-17T14:02:22.0309006Z Ran 14 tests in 28.572s 2022-08-17T14:02:22.0309203Z 2022-08-17T14:02:22.0312778Z OK (skipped=7) 2022-08-17T14:02:22.0312972Z 2022-08-17T14:02:22.0313430Z Generating XML reports... 2022-08-17T14:02:22.0359639Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_meta/TEST-TestFSDPWithMetaDevice-20220817140153.xml 2022-08-17T14:02:22.3843546Z Running distributed/fsdp/test_fsdp_ignored_modules ... [2022-08-17 14:02:22.383923] 2022-08-17T14:02:22.3844297Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_ignored_modules.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:02:22.383995] 2022-08-17T14:02:24.0050886Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_ignored_modules 2022-08-17T14:02:24.0069680Z 2022-08-17T14:02:24.0070130Z Running tests... 2022-08-17T14:02:24.0070807Z ---------------------------------------------------------------------- 2022-08-17T14:02:24.0082075Z test_diff_ignored_modules_across_ranks_pass_ignored_modules_to_root_False (__main__.TestFSDPIgnoredModules) 2022-08-17T14:02:25.5080573Z Tests ignoring different modules across ranks. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:02:25.5262891Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 107945 2022-08-17T14:02:25.5269599Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 107946 2022-08-17T14:02:27.0107925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:27.0108908Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:27.0110113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:27.0111045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:27.0183984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:27.0185272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:27.0190525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:27.0191505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:27.1757664Z dist init r=0, world=2 2022-08-17T14:02:27.1761627Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:27.1852772Z dist init r=1, world=2 2022-08-17T14:02:27.1857562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:27.1858896Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:27.1864915Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:02:28.5376679Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:28.5377657Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:28.5615193Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:28.5616281Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:28.5617623Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:28.5618682Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:29.4369199Z ok (5.430s) 2022-08-17T14:02:29.4377675Z test_diff_ignored_modules_across_ranks_pass_ignored_modules_to_root_True (__main__.TestFSDPIgnoredModules) 2022-08-17T14:02:29.4391845Z Tests ignoring different modules across ranks. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108028 2022-08-17T14:02:29.4397955Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108029 2022-08-17T14:02:30.8850892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:30.8851423Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:30.8853714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:30.8854179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:30.9345520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:30.9345987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:30.9350178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:30.9350848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:31.0538406Z dist init r=0, world=2 2022-08-17T14:02:31.0542458Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:31.1083933Z dist init r=1, world=2 2022-08-17T14:02:31.1088193Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:31.1089257Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:31.1153959Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:02:32.4881570Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:32.4882181Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:32.5142336Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:32.5142918Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:32.5171162Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:32.5171709Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:33.4498557Z ok (4.013s) 2022-08-17T14:02:33.4505345Z test_ignored_modules_invalid (__main__.TestFSDPIgnoredModules) 2022-08-17T14:02:33.4520712Z Tests that passing an FSDP module as an ignored module or the ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108111 2022-08-17T14:02:33.4526723Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108112 2022-08-17T14:02:34.8744059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:34.8744625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:34.8747554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:34.8748048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:34.8994811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:34.8995259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:34.8999393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:34.8999868Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:35.0393392Z dist init r=0, world=2 2022-08-17T14:02:35.0397368Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:35.0646121Z dist init r=1, world=2 2022-08-17T14:02:35.0650304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:35.0651040Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:35.0704114Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:02:36.4206325Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:36.4206856Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:36.8615143Z ok (3.412s) 2022-08-17T14:02:36.8623244Z test_ignored_modules_nested (__main__.TestFSDPIgnoredModules) 2022-08-17T14:02:36.8636752Z Tests that passing a module with nested FSDP modules does not ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108190 2022-08-17T14:02:36.8642844Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108191 2022-08-17T14:02:38.2861011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:38.2861656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:38.2863546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:38.2864028Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:38.3081161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:38.3081644Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:38.3085909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:38.3086385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:38.4522620Z dist init r=1, world=2 2022-08-17T14:02:38.4526265Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:38.4799849Z dist init r=0, world=2 2022-08-17T14:02:38.4804322Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-08-17T14:02:38.4805109Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:38.4832693Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:39.8749999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:39.8750530Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:39.9032478Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:39.9033055Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:39.9033763Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:39.9034312Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:40.7741951Z ok (3.913s) 2022-08-17T14:02:40.7751855Z test_ignored_modules_transformer (__main__.TestFSDPIgnoredModules) 2022-08-17T14:02:40.7765541Z Tests that ignored modules' parameters are not flattened for a ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108273 2022-08-17T14:02:40.7771213Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108274 2022-08-17T14:02:42.1604926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:42.1605448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:42.1607572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:42.1608061Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:42.2048975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:42.2049430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:42.2053852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:42.2054529Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:42.3269360Z dist init r=1, world=2 2022-08-17T14:02:42.3273295Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:42.3712935Z dist init r=0, world=2 2022-08-17T14:02:42.3717189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:42.3718218Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:42.3784207Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:02:43.7356941Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:43.7357453Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:43.7869121Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:43.7869706Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:43.7871148Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:02:43.7871701Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:02:44.9871669Z ok (4.213s) 2022-08-17T14:02:44.9871865Z 2022-08-17T14:02:44.9872258Z ---------------------------------------------------------------------- 2022-08-17T14:02:44.9872609Z Ran 5 tests in 20.980s 2022-08-17T14:02:44.9874759Z 2022-08-17T14:02:44.9874950Z OK 2022-08-17T14:02:44.9875120Z 2022-08-17T14:02:44.9875266Z Generating XML reports... 2022-08-17T14:02:44.9913508Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_ignored_modules/TEST-TestFSDPIgnoredModules-20220817140223.xml 2022-08-17T14:02:45.3440485Z Running distributed/_shard/checkpoint/test_file_system_checkpoint ... [2022-08-17 14:02:45.343543] 2022-08-17T14:02:45.3441300Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/checkpoint/test_file_system_checkpoint.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:02:45.343616] 2022-08-17T14:02:46.9722699Z Test results will be stored in test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint 2022-08-17T14:02:46.9741704Z 2022-08-17T14:02:46.9741846Z Running tests... 
2022-08-17T14:02:46.9742532Z ---------------------------------------------------------------------- 2022-08-17T14:02:48.4928871Z test_load_rowwise_to_colwise (__main__.TestDistributedReshardOnLoad) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:02:48.5120055Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108391 2022-08-17T14:02:48.5126224Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108392 2022-08-17T14:02:49.9210841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:49.9211862Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:49.9213045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:49.9213964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:49.9476181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:49.9477120Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:49.9478949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:49.9480137Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:50.0959763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:50.0963824Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:50.1262568Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:50.1268242Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:50.1269668Z INFO:torch.distributed.distributed_c10d:Rank 1: 
Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:50.1270986Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:52.0217466Z ok (5.047s) 2022-08-17T14:02:52.0252945Z test_load_with_different_shard_plan (__main__.TestDistributedReshardOnLoad) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108472 2022-08-17T14:02:52.0259317Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108473 2022-08-17T14:02:53.4390188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:53.4390811Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:53.4391729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:53.4392208Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:53.4700336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:53.4700788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:53.4704012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:53.4704510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:53.6099655Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:53.6103828Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:53.6473173Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:02:53.6478393Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:53.6479402Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:53.6512008Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:55.7354414Z ok (3.714s) 2022-08-17T14:02:55.7374134Z test_save_load_bytes (__main__.TestDistributedReshardOnLoad) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108553 2022-08-17T14:02:55.7380636Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108554 2022-08-17T14:02:57.2005662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:57.2006173Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:57.2007078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:57.2007556Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:57.2455683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:02:57.2456145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:02:57.2458926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:02:57.2459411Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:02:57.3726306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:02:57.3730175Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:02:57.4263751Z INFO:torch.testing._internal.common_distributed:Starting event 
listener thread for rank 0 2022-08-17T14:02:57.4269482Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:02:57.4270603Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:57.4341538Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:02:59.2469347Z ok (3.511s) 2022-08-17T14:02:59.2499488Z test_switch_between_sharded_tensor_to_tensor (__main__.TestDistributedReshardOnLoad) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108634 2022-08-17T14:02:59.2505614Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108635 2022-08-17T14:03:00.6923578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:00.6924095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:00.6925537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:00.6926026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:00.7533245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:00.7533726Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:00.7536672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:00.7537164Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:00.8642125Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:00.8646424Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:00.9313033Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:00.9318438Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:00.9319208Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:00.9360237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:03.0601340Z ok (3.813s) 2022-08-17T14:03:03.1018763Z test_read_write_only_tensor (__main__.TestDistributedStateDictSaveLoad) ... ok (0.042s) 2022-08-17T14:03:03.1042289Z test_read_write_shard_tensor (__main__.TestDistributedStateDictSaveLoadWithSharedTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108715 2022-08-17T14:03:03.1048219Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108716 2022-08-17T14:03:04.5005683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:04.5006218Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:04.5007504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:04.5007982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:04.5331470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:04.5331958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:04.5334508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:04.5334990Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:04.6711202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:04.6715452Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:04.7122741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:04.7127370Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:04.7128583Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:04.7225539Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:06.6148653Z ok (3.513s) 2022-08-17T14:03:06.6148863Z 2022-08-17T14:03:06.6149236Z ---------------------------------------------------------------------- 2022-08-17T14:03:06.6149821Z Ran 6 tests in 19.641s 2022-08-17T14:03:06.6151985Z 2022-08-17T14:03:06.6152185Z OK 2022-08-17T14:03:06.6152429Z 2022-08-17T14:03:06.6152598Z Generating XML reports... 2022-08-17T14:03:06.6190416Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint/TEST-TestDistributedReshardOnLoad-20220817140246.xml 2022-08-17T14:03:06.6193275Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint/TEST-TestDistributedStateDictSaveLoad-20220817140246.xml 2022-08-17T14:03:06.6196710Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint/TEST-TestDistributedStateDictSaveLoadWithSharedTensor-20220817140246.xml 2022-08-17T14:03:06.9862385Z Running distributed/_shard/checkpoint/test_checkpoint ... 
[2022-08-17 14:03:06.985709] 2022-08-17T14:03:06.9863155Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/checkpoint/test_checkpoint.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:03:06.985781] 2022-08-17T14:03:08.6122861Z Test results will be stored in test-reports/python-unittest/distributed._shard.checkpoint.test_checkpoint 2022-08-17T14:03:08.6143570Z 2022-08-17T14:03:08.6144081Z Running tests... 2022-08-17T14:03:08.6144574Z ---------------------------------------------------------------------- 2022-08-17T14:03:10.1417120Z test_storage_key_mapping (__main__.TestDistributedCheckpointing) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:03:10.1610498Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108831 2022-08-17T14:03:10.1616589Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108832 2022-08-17T14:03:11.6012998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:11.6013539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:11.6014499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:11.6014963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:11.6276129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:11.6276595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:11.6280057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:11.6280691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:11.7729626Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-08-17T14:03:11.7734070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:11.7947100Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:11.7951853Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:11.7952869Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:11.8041352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:13.5705753Z ok (4.956s) 2022-08-17T14:03:13.5724819Z test_tensor_metadata_with_missing_rank_spec (__main__.TestDistributedCheckpointing) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108910 2022-08-17T14:03:13.5730664Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108911 2022-08-17T14:03:14.9966699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:14.9967248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:14.9968290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:14.9968806Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:15.0246814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:15.0247292Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:15.0250109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:15.0250600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled 
tests") 2022-08-17T14:03:15.1631692Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:15.1635737Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:15.1945371Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:15.1950192Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:15.1951166Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:15.2044408Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:16.9818353Z ok (3.411s) 2022-08-17T14:03:16.9837509Z test_dummy_writer_works (__main__.TestDistributedFailure) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 108989 2022-08-17T14:03:16.9843236Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 108990 2022-08-17T14:03:16.9849994Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 108991 2022-08-17T14:03:16.9856668Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 108992 2022-08-17T14:03:18.4289543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:18.4290074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:18.4291071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:18.4291800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:18.4517577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 
2022-08-17T14:03:18.4518067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:18.4520823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:18.4521298Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:18.4999331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:18.4999809Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:18.5002293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:18.5002766Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:18.5005824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:18.5006284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:18.5009206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:18.5009674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:18.5947700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:03:18.6174898Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:18.6760519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:18.6785005Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:03:19.0923322Z skip: Need at least 4 CUDA devices (2.110s) 2022-08-17T14:03:19.0946583Z test_load_error_handling (__main__.TestDistributedFailure) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109125 2022-08-17T14:03:19.0952822Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109126 2022-08-17T14:03:19.0959597Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 109127 2022-08-17T14:03:19.0965524Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 109128 2022-08-17T14:03:20.5153258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:20.5154244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:20.5155396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:20.5156307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:20.5424659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:20.5425563Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:20.5429073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:20.5430054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:20.5576535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:20.5577440Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:20.5579094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:20.5580038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:20.5953776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:03:20.5954697Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:20.5956214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:20.5957175Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:20.6816497Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:20.7083787Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:20.7222471Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:03:20.7635375Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:03:21.1023582Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T14:03:21.1043149Z test_load_error_handling_no_dist (__main__.TestDistributedFailure) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109261 2022-08-17T14:03:21.1049040Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109262 2022-08-17T14:03:21.1054933Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 109263 2022-08-17T14:03:21.1060994Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 109264 2022-08-17T14:03:22.5313504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:22.5314009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:22.5315037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:22.5315532Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:22.5340897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:22.5341360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:22.5344550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:22.5345040Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:22.5381982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:22.5382443Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:22.5385896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:22.5386358Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:22.5580076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:03:22.5580545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:22.5583609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:22.5584297Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:22.6981270Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:03:22.7012396Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:22.7050125Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:03:22.7316299Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:23.1119986Z ok (2.010s) 2022-08-17T14:03:23.1142290Z test_save_error_handling (__main__.TestDistributedFailure) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109397 2022-08-17T14:03:23.1148738Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109398 2022-08-17T14:03:23.1154617Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 109399 2022-08-17T14:03:23.1160952Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 109400 2022-08-17T14:03:24.5437525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:24.5438018Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:24.5438803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:24.5439282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:24.5849874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:03:24.5850439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:24.5852665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:24.5853137Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:24.5929065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:24.5929501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:24.5932731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:24.5933201Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:24.6393480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:24.6393933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:24.6396646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:24.6397129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:24.7093677Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:03:24.7529148Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:03:24.7599942Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:24.8127734Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:25.1225112Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T14:03:25.1244569Z test_save_error_handling_no_dist (__main__.TestDistributedFailure) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109533 2022-08-17T14:03:25.1250950Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109534 2022-08-17T14:03:25.1256774Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 109535 2022-08-17T14:03:25.1264386Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 109536 2022-08-17T14:03:26.5855569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:26.5856070Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:26.5857031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:26.5857534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:26.5884922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:26.5885865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:26.5888036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:26.5888498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:26.5889118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:26.5889738Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:26.5891912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:26.5966518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:26.5967092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:03:26.5967727Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:26.5969948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:26.5970418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:26.7552324Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:03:26.7659080Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:26.7700707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:03:26.7750168Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:27.1320980Z ok (2.010s) 2022-08-17T14:03:27.1336273Z test_create_key_handles_collision (__main__.TestStorageKeys) ... ok (0.001s) 2022-08-17T14:03:27.1336883Z 2022-08-17T14:03:27.1337288Z ---------------------------------------------------------------------- 2022-08-17T14:03:27.1337649Z Ran 8 tests in 18.519s 2022-08-17T14:03:27.1337816Z 2022-08-17T14:03:27.1337936Z OK (skipped=3) 2022-08-17T14:03:27.1338095Z 2022-08-17T14:03:27.1338227Z Generating XML reports... 2022-08-17T14:03:27.1376952Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_checkpoint/TEST-TestDistributedCheckpointing-20220817140308.xml 2022-08-17T14:03:27.1385106Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_checkpoint/TEST-TestDistributedFailure-20220817140308.xml 2022-08-17T14:03:27.1388641Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_checkpoint/TEST-TestStorageKeys-20220817140308.xml 2022-08-17T14:03:27.4885394Z Running distributed/test_c10d_object_collectives ... 
[2022-08-17 14:03:27.487995] 2022-08-17T14:03:27.4886312Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_object_collectives.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:03:27.488064] 2022-08-17T14:03:29.0593038Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_object_collectives 2022-08-17T14:03:29.0609403Z 2022-08-17T14:03:29.0609834Z Running tests... 2022-08-17T14:03:29.0610547Z ---------------------------------------------------------------------- 2022-08-17T14:03:30.5941398Z test_all_gather_object (__main__.TestObjectCollectives) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:03:30.6131363Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109704 2022-08-17T14:03:30.6135901Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109705 2022-08-17T14:03:31.9366778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:31.9367296Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:31.9370396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:31.9371197Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:32.0205528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:32.0205995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:32.0209923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:32.0210387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:32.1013252Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-08-17T14:03:32.1017783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:32.1932969Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:32.1938071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:32.1938924Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:32.2036118Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:34.0223629Z ok (4.961s) 2022-08-17T14:03:34.0241365Z test_broadcast_object_list (__main__.TestObjectCollectives) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109783 2022-08-17T14:03:34.0247153Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109784 2022-08-17T14:03:35.4394690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:35.4395360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:35.4397669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:35.4398179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:35.4547874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:35.4548343Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:35.4552465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:35.4552950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-08-17T14:03:35.6045450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:35.6049753Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:35.6230246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:35.6235892Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:35.6236804Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:35.6254557Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:37.4331115Z ok (3.411s) 2022-08-17T14:03:37.4349330Z test_gather_object (__main__.TestObjectCollectives) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109862 2022-08-17T14:03:37.4355754Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109863 2022-08-17T14:03:38.8477146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:38.8477832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:38.8483067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:38.8483910Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:38.8705117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:38.8705589Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:38.8709542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 
2022-08-17T14:03:38.8710028Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:39.0134058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:39.0138450Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:39.0396350Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:39.0400748Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:39.0401601Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:39.0445874Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:40.8440407Z ok (3.411s) 2022-08-17T14:03:40.8457474Z test_scatter_object_list (__main__.TestObjectCollectives) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 109943 2022-08-17T14:03:40.8463620Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 109944 2022-08-17T14:03:42.2645839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:42.2646820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:42.2648414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:42.2649349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:42.2904082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:42.2905243Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:42.2908465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:42.2909430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:42.4328664Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:42.4333298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:42.4607133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:42.4611972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:42.4612729Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:42.4640525Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:03:44.2551672Z ok (3.411s) 2022-08-17T14:03:44.2551901Z 2022-08-17T14:03:44.2552298Z ---------------------------------------------------------------------- 2022-08-17T14:03:44.2552652Z Ran 4 tests in 15.194s 2022-08-17T14:03:44.2552801Z 2022-08-17T14:03:44.2552896Z OK 2022-08-17T14:03:44.2555494Z 2022-08-17T14:03:44.2555856Z Generating XML reports... 2022-08-17T14:03:44.2592421Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_object_collectives/TEST-TestObjectCollectives-20220817140329.xml 2022-08-17T14:03:44.6190336Z Running distributed/_shard/checkpoint/test_file_system_checkpoint_cpu ... [2022-08-17 14:03:44.618532] 2022-08-17T14:03:44.6191188Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/checkpoint/test_file_system_checkpoint_cpu.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:03:44.618601] 2022-08-17T14:03:46.2579365Z Test results will be stored in test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint_cpu 2022-08-17T14:03:46.2598500Z 2022-08-17T14:03:46.2598741Z Running tests... 2022-08-17T14:03:46.2599168Z ---------------------------------------------------------------------- 2022-08-17T14:03:47.7907697Z test_load_rowwise_to_colwise (__main__.TestDistributedReshardOnLoad) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:03:47.8101290Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110059 2022-08-17T14:03:47.8108008Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110060 2022-08-17T14:03:49.2104558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:49.2105076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:49.2107778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:49.2108272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:49.2373757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:49.2374224Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:49.2378547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:49.2379045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:49.3829996Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:49.4154003Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:49.4366362Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:49.4366883Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:49.4367641Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:03:49.4368333Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:49.8163113Z ok (3.556s) 2022-08-17T14:03:49.8198903Z test_load_with_different_shard_plan (__main__.TestDistributedReshardOnLoad) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110133 2022-08-17T14:03:49.8205135Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110134 2022-08-17T14:03:51.2499980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:51.2500522Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:51.2502737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:51.2503226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:51.2632083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:51.2632550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:51.2636916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:51.2637574Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:51.4242533Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:51.4430119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:51.4541341Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:51.4541869Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:51.4542621Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:51.4543510Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:51.9259399Z ok (2.110s) 2022-08-17T14:03:51.9279189Z test_save_load_bytes (__main__.TestDistributedReshardOnLoad) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110207 2022-08-17T14:03:51.9285071Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110208 2022-08-17T14:03:53.3268850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:53.3269342Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:53.3272334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:53.3272824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:53.3514278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:53.3514728Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:53.3519020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:53.3519502Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:53.5000106Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:53.5301206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:53.5415972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:53.5416487Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:53.5417220Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:53.5417904Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:53.8335148Z ok (1.907s) 2022-08-17T14:03:53.8365760Z test_switch_between_sharded_tensor_to_tensor (__main__.TestDistributedReshardOnLoad) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110281 2022-08-17T14:03:53.8371627Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110282 2022-08-17T14:03:55.2429360Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:55.2429876Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:55.2432303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:55.2432793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:55.2606333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:55.2606791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:55.2611144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:55.2611636Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:55.4171523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:55.4332650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:55.4547596Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:55.4548117Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:55.4548835Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:55.4549530Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:56.0429706Z ok (2.209s) 2022-08-17T14:03:56.0639459Z test_read_write_only_tensor (__main__.TestDistributedStateDictSaveLoad) ... ok (0.021s) 2022-08-17T14:03:56.0663085Z test_read_write_shard_tensor (__main__.TestDistributedStateDictSaveLoadWithSharedTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110355 2022-08-17T14:03:56.0670063Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110356 2022-08-17T14:03:57.5160936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:57.5161448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:57.5163383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:57.5163866Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:03:57.5358355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:03:57.5358833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:03:57.5363429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:03:57.5363912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-08-17T14:03:57.6880334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:03:57.7155808Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:03:57.7370481Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:03:57.7371058Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:03:57.7371785Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:57.7372709Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:03:58.0722163Z ok (2.008s) 2022-08-17T14:03:58.0722531Z 2022-08-17T14:03:58.0723079Z ---------------------------------------------------------------------- 2022-08-17T14:03:58.0723438Z Ran 6 tests in 11.812s 2022-08-17T14:03:58.0723610Z 2022-08-17T14:03:58.0723705Z OK 2022-08-17T14:03:58.0723851Z 2022-08-17T14:03:58.0723986Z Generating XML reports... 2022-08-17T14:03:58.0762612Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint_cpu/TEST-TestDistributedReshardOnLoad-20220817140346.xml 2022-08-17T14:03:58.0766352Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint_cpu/TEST-TestDistributedStateDictSaveLoad-20220817140346.xml 2022-08-17T14:03:58.0769674Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint_cpu/TEST-TestDistributedStateDictSaveLoadWithSharedTensor-20220817140346.xml 2022-08-17T14:03:58.4310256Z Running distributed/_shard/sharding_plan/test_sharding_plan ... 
[2022-08-17 14:03:58.430522] 2022-08-17T14:03:58.4311039Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharding_plan/test_sharding_plan.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:03:58.430593] 2022-08-17T14:04:00.0330530Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharding_plan.test_sharding_plan 2022-08-17T14:04:00.0347519Z 2022-08-17T14:04:00.0347918Z Running tests... 2022-08-17T14:04:00.0348430Z ---------------------------------------------------------------------- 2022-08-17T14:04:01.5574294Z test_custom_sharding_planner (__main__.TestShardingPlan) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:04:01.5765413Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110464 2022-08-17T14:04:01.5771985Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110465 2022-08-17T14:04:01.5778458Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 110466 2022-08-17T14:04:01.5785248Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 110467 2022-08-17T14:04:03.0067711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:03.0068219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:03.0068981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:03.0069447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:03.0124479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:03.0124958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:03.0127878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 
disabled tests 2022-08-17T14:04:03.0128338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:03.0134606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:03.0135055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:03.0138301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:03.0138761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:03.0329726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:03.0330193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:03.0333483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:03.0334008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:03.1736117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:03.1841346Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:03.1850249Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:03.2050251Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:03.5846958Z skip: Need at least 4 CUDA devices (3.550s) 2022-08-17T14:04:03.5877402Z test_reshard_to_ddp_sharding_plan (__main__.TestShardingPlan) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110600 2022-08-17T14:04:03.5883700Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110601 2022-08-17T14:04:03.5890332Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 110602 2022-08-17T14:04:03.5896709Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 110603 2022-08-17T14:04:05.0161337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:05.0162217Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:05.0162881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:05.0163362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:05.0170788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:05.0171270Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:05.0173935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:05.0174427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:05.0195644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:05.0196100Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:05.0199217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:05.0199696Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:05.0367746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:04:05.0368225Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:05.0371241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:05.0371729Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:05.1961057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:05.1972461Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:05.1983493Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:05.2106431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:05.5955353Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T14:04:05.5975995Z test_shard_module_sub_process_group (__main__.TestShardingPlan) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110736 2022-08-17T14:04:05.5982156Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110737 2022-08-17T14:04:05.5988564Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 110738 2022-08-17T14:04:05.5994986Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 110739 2022-08-17T14:04:07.0229289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:07.0230035Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:07.0230630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:07.0231106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:07.0302088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:07.0302560Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:07.0306223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:07.0306713Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:07.0307536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:07.0307993Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:07.0310655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:07.0311136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:07.0507731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:04:07.0508189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:07.0511386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:07.0511859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:07.1882755Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:07.2035154Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:07.2046360Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:07.2249761Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:07.6053620Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T14:04:07.6079989Z test_sharding_plan_errors (__main__.TestShardingPlan) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 110872 2022-08-17T14:04:07.6086118Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 110873 2022-08-17T14:04:07.6092437Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 110874 2022-08-17T14:04:07.6098732Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 110875 2022-08-17T14:04:09.0384022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:09.0384875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:09.0386093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:09.0386676Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:09.0537539Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:09.0538019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:09.0540856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:09.0541350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:09.0564311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:09.0564797Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:09.0567301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:09.0567792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:09.0634782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:09.0635240Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:09.0638077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:09.0638716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:09.2032037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:09.2253791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:09.2254296Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:09.2314353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:09.6160022Z skip: Need at least 4 CUDA devices (2.010s) 
2022-08-17T14:04:09.6213764Z test_sharding_plan_simple_megatron (__main__.TestShardingPlan) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 111008 2022-08-17T14:04:09.6220002Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 111009 2022-08-17T14:04:09.6226529Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 111010 2022-08-17T14:04:09.6232832Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 111011 2022-08-17T14:04:11.0596949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:11.0597472Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:11.0598421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:11.0598909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:11.1365895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:11.1366389Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:11.1366940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:11.1367417Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:11.1368015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:11.1368496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:11.1369067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:11.1369534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:11.1522131Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:11.1522598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:11.1525921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:11.1526412Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:11.2270836Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:11.3102186Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:11.3102700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:11.3263817Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:11.7293908Z skip: Need at least 4 CUDA devices (2.113s) 2022-08-17T14:04:11.7294227Z 2022-08-17T14:04:11.7294609Z ---------------------------------------------------------------------- 2022-08-17T14:04:11.7294972Z Ran 5 tests in 11.694s 2022-08-17T14:04:11.7295142Z 2022-08-17T14:04:11.7295254Z OK (skipped=5) 2022-08-17T14:04:11.7295392Z 2022-08-17T14:04:11.7295522Z Generating XML reports... 2022-08-17T14:04:11.7336485Z Generated XML report: test-reports/python-unittest/distributed._shard.sharding_plan.test_sharding_plan/TEST-TestShardingPlan-20220817140400.xml 2022-08-17T14:04:12.0918866Z Running distributed/_shard/test_partial_tensor ... [2022-08-17 14:04:12.091399] 2022-08-17T14:04:12.0919656Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/test_partial_tensor.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 14:04:12.091471] 2022-08-17T14:04:13.7073679Z Test results will be stored in test-reports/python-unittest/distributed._shard.test_partial_tensor 2022-08-17T14:04:13.7091225Z 2022-08-17T14:04:13.7091512Z Running tests... 2022-08-17T14:04:13.7092155Z ---------------------------------------------------------------------- 2022-08-17T14:04:15.2417456Z test_cat (__main__.TestPartialTensorOps) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:04:15.2608959Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 111179 2022-08-17T14:04:15.2615091Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 111180 2022-08-17T14:04:15.2621321Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 111181 2022-08-17T14:04:15.2628141Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 111182 2022-08-17T14:04:16.7112583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:16.7113070Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:16.7114092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:16.7114571Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:16.7282899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:16.7283393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:16.7286036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:16.7286520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:16.7301460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow 
tests 2022-08-17T14:04:16.7301919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:16.7304862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:16.7305322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:16.8090578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:16.8091045Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:16.8093142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:16.8093617Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:16.8763014Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:16.8989960Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:16.9004474Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:16.9841573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:17.3693061Z skip: Need at least 4 CUDA devices (3.660s) 2022-08-17T14:04:17.3714925Z test_cat_errors (__main__.TestPartialTensorOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 111315 2022-08-17T14:04:17.3721206Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 111316 2022-08-17T14:04:17.3727776Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 111317 2022-08-17T14:04:17.3734502Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 111318 2022-08-17T14:04:18.7911797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:18.7912807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:18.7913979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:18.7914931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:18.7939863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:18.7940753Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:18.7943602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:18.7944920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:18.8492842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:18.8493779Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:18.8495429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:18.8496375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:18.8681767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:04:18.8682724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:18.8684484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:18.8685463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:18.9643421Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:18.9662832Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:19.0190915Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:19.0409472Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:19.3793935Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T14:04:19.3812318Z test_transpose (__main__.TestPartialTensorOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 111451 2022-08-17T14:04:19.3818343Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 111452 2022-08-17T14:04:19.3824725Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 111453 2022-08-17T14:04:19.3831061Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 111454 2022-08-17T14:04:20.8883372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:20.8884086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:20.8885002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:20.8885533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:20.8886128Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:20.8886580Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:20.8889398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:20.8890119Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:20.9019343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:20.9019799Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:20.9022778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:20.9023259Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:20.9293791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:20.9294246Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:20.9297085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:20.9297580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:21.0664307Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:21.0673568Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:21.0682541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:21.0950926Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:21.4891626Z skip: Need at least 4 CUDA devices (2.110s) 
2022-08-17T14:04:21.4916353Z test_partial_tensor_reshard (__main__.TestPartialTensorReshard) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 111587 2022-08-17T14:04:21.4922188Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 111588 2022-08-17T14:04:21.4928354Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 111589 2022-08-17T14:04:21.4934697Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 111590 2022-08-17T14:04:22.9527171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:22.9527734Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:22.9528753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:22.9529224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:22.9603820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:22.9604290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:22.9607108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:22.9607579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:22.9694458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:22.9695154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:22.9697765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:22.9698222Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:22.9768095Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:22.9768543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:22.9771442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:22.9771920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:23.1209855Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:23.1345435Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:23.1358525Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:23.1419289Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:23.4994017Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T14:04:23.5017530Z test_partial_tensor_reshard_errors (__main__.TestPartialTensorReshard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 111723 2022-08-17T14:04:23.5023768Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 111724 2022-08-17T14:04:23.5030212Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 111725 2022-08-17T14:04:23.5036380Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 111726 2022-08-17T14:04:24.9145681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:24.9146402Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:24.9147651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:24.9148142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:24.9348175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:24.9348641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:24.9351606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:24.9352064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:24.9575399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:24.9575865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:24.9578754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:24.9579219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:24.9988131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:04:24.9988881Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:24.9990984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:24.9991435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:25.0795577Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:25.1011113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:25.1330285Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:25.1724949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:25.5095252Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T14:04:25.5095507Z 2022-08-17T14:04:25.5095895Z ---------------------------------------------------------------------- 2022-08-17T14:04:25.5096236Z Ran 5 tests in 11.800s 2022-08-17T14:04:25.5096388Z 2022-08-17T14:04:25.5096498Z OK (skipped=5) 2022-08-17T14:04:25.5096656Z 2022-08-17T14:04:25.5099791Z Generating XML reports... 2022-08-17T14:04:25.5135595Z Generated XML report: test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorOps-20220817140413.xml 2022-08-17T14:04:25.5139767Z Generated XML report: test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorReshard-20220817140413.xml 2022-08-17T14:04:25.8540258Z Running distributed/_shard/sharded_tensor/ops/test_binary_cmp ... [2022-08-17 14:04:25.853506] 2022-08-17T14:04:25.8541130Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_binary_cmp.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 14:04:25.853576] 2022-08-17T14:04:27.4369447Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_binary_cmp 2022-08-17T14:04:27.4385501Z 2022-08-17T14:04:27.4385649Z Running tests... 2022-08-17T14:04:27.4386081Z ---------------------------------------------------------------------- 2022-08-17T14:04:27.4396101Z test_torch_allclose (__main__.TestShardedTensorBinaryOps) 2022-08-17T14:04:28.9239074Z Test torch.allclose(ShardedTensor, ShardedTensor) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:04:28.9423874Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 111894 2022-08-17T14:04:28.9430380Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 111895 2022-08-17T14:04:28.9436328Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 111896 2022-08-17T14:04:28.9442596Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 111897 2022-08-17T14:04:30.3643205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:30.3643708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:30.3644476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:30.3644935Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:30.3819670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:30.3820151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:30.3823040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:30.3823504Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:30.4153849Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:30.4154308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:30.4157099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:30.4157555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:30.4330417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:30.4330882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:30.4333772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:30.4334287Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:30.5288751Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:30.5543147Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:30.5813076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:30.5994870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:30.9503267Z skip: Need at least 4 CUDA devices (3.511s) 2022-08-17T14:04:30.9519512Z test_torch_allclose_tensor_specs (__main__.TestShardedTensorBinaryOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112030 2022-08-17T14:04:30.9525623Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112031 2022-08-17T14:04:30.9532137Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 112032 2022-08-17T14:04:30.9538297Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 112033 2022-08-17T14:04:32.3638450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:32.3639254Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:32.3640376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:32.3640887Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:32.3810782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:32.3811245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:32.3814243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:32.3814728Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:32.3862618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:32.3863062Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:32.3866653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:32.3867128Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:32.3976782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:04:32.3977221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:32.3980090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:32.3980566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:32.5301496Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:32.5457501Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:32.5536229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:32.5699061Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:32.9596155Z skip: Need at least 4 CUDA devices (2.009s) 2022-08-17T14:04:32.9600454Z test_torch_equal (__main__.TestShardedTensorBinaryOps) 2022-08-17T14:04:32.9614448Z Test torch.equal(ShardedTensor, ShardedTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112166 2022-08-17T14:04:32.9622411Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112167 2022-08-17T14:04:32.9631022Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 112168 2022-08-17T14:04:32.9638846Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 112169 2022-08-17T14:04:34.3854972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:34.3855449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:34.3857116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:34.3857597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:34.3865171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:34.3865855Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:34.3869263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:34.3869742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:34.4464783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:34.4465233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:34.4467677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:34.4468157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:34.4666536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:04:34.4666989Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:34.4670125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:34.4670604Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:34.5585835Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:34.5586319Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:34.6164430Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:34.6387113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:34.9700382Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T14:04:34.9716764Z test_torch_equal_tensor_specs (__main__.TestShardedTensorBinaryOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112302 2022-08-17T14:04:34.9722733Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112303 2022-08-17T14:04:34.9728780Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 112304 2022-08-17T14:04:34.9734857Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 112305 2022-08-17T14:04:36.4044441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:36.4044951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:36.4045979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:36.4046474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:36.4143166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:36.4143658Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:36.4146708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:36.4147192Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:36.4267436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:36.4267893Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:36.4270904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:36.4271358Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:36.4532802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:04:36.4533469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:36.4536134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:36.4536598Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:36.5707631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:36.5838569Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:36.6007878Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:36.6219365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:36.9794649Z skip: Need at least 4 CUDA devices (2.009s) 2022-08-17T14:04:36.9794902Z 2022-08-17T14:04:36.9795271Z ---------------------------------------------------------------------- 2022-08-17T14:04:36.9795628Z Ran 4 tests in 9.541s 2022-08-17T14:04:36.9795777Z 2022-08-17T14:04:36.9795888Z OK (skipped=4) 2022-08-17T14:04:36.9796041Z 2022-08-17T14:04:36.9796169Z Generating XML reports... 2022-08-17T14:04:36.9835482Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_binary_cmp/TEST-TestShardedTensorBinaryOps-20220817140427.xml 2022-08-17T14:04:37.3287855Z Running distributed/fsdp/test_distributed_checkpoint ... [2022-08-17 14:04:37.328281] 2022-08-17T14:04:37.3288619Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_distributed_checkpoint.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:04:37.328353] 2022-08-17T14:04:38.9447500Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_distributed_checkpoint 2022-08-17T14:04:38.9464519Z 2022-08-17T14:04:38.9464663Z Running tests... 
2022-08-17T14:04:38.9465103Z ---------------------------------------------------------------------- 2022-08-17T14:04:40.4268527Z test_distributed_checkpoint_state_dict_type_StateDictType_LOCAL_STATE_DICT (__main__.TestDistributedCheckpoint) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:04:40.4448422Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112473 2022-08-17T14:04:40.4455189Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112474 2022-08-17T14:04:41.8928419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:41.8928904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:41.8931370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:41.8931855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:41.9062044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:41.9062508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:41.9067179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:41.9067845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:42.0587678Z dist init r=0, world=2 2022-08-17T14:04:42.0591325Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:04:42.0780720Z dist init r=1, world=2 2022-08-17T14:04:42.0785358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:04:42.0786607Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:04:42.0796460Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:04:43.4698953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:43.4699473Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:43.9546161Z ok (5.008s) 2022-08-17T14:04:43.9574136Z test_distributed_checkpoint_state_dict_type_StateDictType_SHARDED_STATE_DICT (__main__.TestDistributedCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112554 2022-08-17T14:04:43.9579976Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112555 2022-08-17T14:04:45.4527547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:45.4528032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:45.4530528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:45.4531030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:45.4611953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:45.4612423Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:45.4616768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:45.4617258Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:45.6236653Z dist init r=1, world=2 2022-08-17T14:04:45.6240476Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:04:45.6320285Z dist init r=0, world=2 
2022-08-17T14:04:45.6325061Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:04:45.6325885Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:04:45.6343649Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:04:47.0115780Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:47.0116321Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:47.4668743Z ok (3.512s) 2022-08-17T14:04:47.4669098Z 2022-08-17T14:04:47.4669718Z ---------------------------------------------------------------------- 2022-08-17T14:04:47.4670273Z Ran 2 tests in 8.520s 2022-08-17T14:04:47.4670565Z 2022-08-17T14:04:47.4670722Z OK 2022-08-17T14:04:47.4670975Z 2022-08-17T14:04:47.4671217Z Generating XML reports... 2022-08-17T14:04:47.4708216Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_distributed_checkpoint/TEST-TestDistributedCheckpoint-20220817140438.xml 2022-08-17T14:04:47.8140995Z Running distributed/_shard/sharded_tensor/ops/test_init ... [2022-08-17 14:04:47.813613] 2022-08-17T14:04:47.8142078Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_init.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:04:47.813685] 2022-08-17T14:04:49.3770295Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_init 2022-08-17T14:04:49.3788096Z 2022-08-17T14:04:49.3788563Z Running tests... 
2022-08-17T14:04:49.3789222Z ---------------------------------------------------------------------- 2022-08-17T14:04:49.3803045Z test_init_sharded_tensor_with_kaiming_uniform (__main__.TestShardedTensorNNInit) 2022-08-17T14:04:50.9018342Z Test torch.nn.init.kaiming_uniform_(ShardedTensor, a, mode, nonlinearit) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:04:50.9205491Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112670 2022-08-17T14:04:50.9212143Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112671 2022-08-17T14:04:50.9218943Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 112672 2022-08-17T14:04:50.9225572Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 112673 2022-08-17T14:04:52.3432325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:52.3433296Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:52.3434480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:52.3435409Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:52.3436567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:52.3437433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:52.3441821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:52.3442796Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:52.3493411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:52.3494287Z warnings.warn(f"loaded {len(slow_tests_dict)} slow 
tests") 2022-08-17T14:04:52.3496700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:52.3497644Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:52.3658193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:52.3658656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:52.3661029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:52.3661531Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:52.5197956Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:52.5238049Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:52.5246233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:52.5400944Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:52.9285639Z skip: Need at least 4 CUDA devices (3.549s) 2022-08-17T14:04:52.9295550Z test_init_sharded_tensor_with_normal (__main__.TestShardedTensorNNInit) 2022-08-17T14:04:52.9309275Z Test torch.nn.init.normal_(ShardedTensor, mean, std) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112806 2022-08-17T14:04:52.9315425Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112807 2022-08-17T14:04:52.9321610Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 112808 2022-08-17T14:04:52.9328034Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 112809 2022-08-17T14:04:54.3525627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:54.3526138Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:54.3526990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:54.3527455Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:54.3652485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:54.3652933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:54.3655766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:54.3656260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:54.3803163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:54.3803603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:54.3804164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:54.3804609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:54.3805835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 
disabled tests 2022-08-17T14:04:54.3806293Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:54.3808670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:54.3809142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:54.5198159Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:54.5359718Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:54.5548630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:54.5611178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:54.9387253Z skip: Need at least 4 CUDA devices (2.010s) 2022-08-17T14:04:54.9398712Z test_init_sharded_tensor_with_uniform (__main__.TestShardedTensorNNInit) 2022-08-17T14:04:54.9413556Z Test torch.nn.init.uniform_(ShardedTensor, a, b) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 112942 2022-08-17T14:04:54.9420180Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 112943 2022-08-17T14:04:54.9427232Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 112944 2022-08-17T14:04:54.9433522Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 112945 2022-08-17T14:04:56.3488083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:56.3488618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:56.3489728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:56.3490202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:56.3603112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:56.3603583Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:56.3606143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:56.3606821Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:56.3815962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:04:56.3816438Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:56.3819152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:56.3819630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:56.3863515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:04:56.3863977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:04:56.3867258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:04:56.3867737Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:04:56.5140109Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:04:56.5247689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:04:56.5524585Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:04:56.5526641Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:04:56.9498966Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T14:04:56.9499196Z 2022-08-17T14:04:56.9499559Z ---------------------------------------------------------------------- 2022-08-17T14:04:56.9499880Z Ran 3 tests in 7.571s 2022-08-17T14:04:56.9500042Z 2022-08-17T14:04:56.9500162Z OK (skipped=3) 2022-08-17T14:04:56.9501879Z 2022-08-17T14:04:56.9502127Z Generating XML reports... 2022-08-17T14:04:56.9540486Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_init/TEST-TestShardedTensorNNInit-20220817140449.xml 2022-08-17T14:04:57.3026404Z Running distributed/_shard/sharded_tensor/test_sharded_tensor_reshard ... [2022-08-17 14:04:57.302126] 2022-08-17T14:04:57.3027246Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 14:04:57.302198] 2022-08-17T14:04:58.8794982Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor_reshard 2022-08-17T14:04:58.8811771Z 2022-08-17T14:04:58.8812020Z Running tests... 2022-08-17T14:04:58.8812450Z ---------------------------------------------------------------------- 2022-08-17T14:05:00.4073602Z test_sharded_tensor_reshard (__main__.TestReshard) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:05:00.4257877Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 113113 2022-08-17T14:05:00.4264470Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 113114 2022-08-17T14:05:00.4271088Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 113115 2022-08-17T14:05:00.4277499Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 113116 2022-08-17T14:05:01.9260954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:01.9261471Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:01.9262053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:01.9262535Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:01.9511414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:01.9511895Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:01.9514726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:01.9515239Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:01.9592421Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:01.9592880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:01.9595062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:01.9595547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:01.9615288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:01.9615934Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:01.9618767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:01.9619253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:02.0986215Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:05:02.1199300Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:05:02.1390586Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:05:02.1447110Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:05:02.5339692Z skip: Need at least 4 CUDA devices (3.652s) 2022-08-17T14:05:02.5362411Z test_sharded_tensor_reshard_errors (__main__.TestReshard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 113249 2022-08-17T14:05:02.5368382Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 113250 2022-08-17T14:05:02.5374762Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 113251 2022-08-17T14:05:02.5380766Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 113252 2022-08-17T14:05:03.9442351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:03.9442897Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:03.9444255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:03.9444786Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:03.9757416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:03.9758114Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:03.9760327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:03.9760936Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:03.9765711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:03.9766388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:03.9769145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:03.9769836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:04.0447939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: 
loaded 39 slow tests 2022-08-17T14:05:04.0448440Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:04.0450699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:04.0451431Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:04.1097698Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:05:04.1484228Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:05:04.1540491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:05:04.2179597Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:05:04.5447320Z skip: Need at least 4 CUDA devices (2.011s) 2022-08-17T14:05:04.5447784Z 2022-08-17T14:05:04.5448330Z ---------------------------------------------------------------------- 2022-08-17T14:05:04.5449001Z Ran 2 tests in 5.663s 2022-08-17T14:05:04.5449165Z 2022-08-17T14:05:04.5449282Z OK (skipped=2) 2022-08-17T14:05:04.5449428Z 2022-08-17T14:05:04.5449560Z Generating XML reports... 2022-08-17T14:05:04.5485369Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor_reshard/TEST-TestReshard-20220817140458.xml 2022-08-17T14:05:04.8927292Z Running distributed/fsdp/test_fsdp_multiple_forward ... [2022-08-17 14:05:04.892268] 2022-08-17T14:05:04.8928067Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_multiple_forward.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:05:04.892344] 2022-08-17T14:05:06.4897743Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_multiple_forward 2022-08-17T14:05:06.4915172Z 2022-08-17T14:05:06.4915454Z Running tests... 
2022-08-17T14:05:06.4916084Z ---------------------------------------------------------------------- 2022-08-17T14:05:07.9765989Z test_multi_forward (__main__.TestMultiForward) ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:05:07.9946749Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 113420 2022-08-17T14:05:07.9952911Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 113421 2022-08-17T14:05:09.4592668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:09.4593646Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:09.4594804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:09.4595758Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:09.4730451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:09.4731425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:09.4736220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:09.4737181Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:09.6300419Z dist init r=1, world=2 2022-08-17T14:05:09.6304478Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:05:09.6409142Z dist init r=0, world=2 2022-08-17T14:05:09.6414073Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:05:09.6415382Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:05:09.6510174Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:05:11.0159482Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:05:11.0160011Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:05:11.0414638Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:05:11.0415250Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:05:11.0415947Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:05:11.0416502Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:05:11.4484324Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T14:05:11.4485084Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-08-17T14:05:11.4522556Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:05:11.4523384Z warnings.warn( 2022-08-17T14:05:11.4524751Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:05:11.4525523Z warnings.warn( 2022-08-17T14:05:11.9054162Z ok (5.414s) 2022-08-17T14:05:11.9054341Z 2022-08-17T14:05:11.9054715Z ---------------------------------------------------------------------- 2022-08-17T14:05:11.9055034Z Ran 1 test in 5.414s 2022-08-17T14:05:11.9055212Z 2022-08-17T14:05:11.9055303Z OK 2022-08-17T14:05:11.9055442Z 2022-08-17T14:05:11.9055571Z Generating XML reports... 2022-08-17T14:05:11.9090665Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_multiple_forward/TEST-TestMultiForward-20220817140506.xml 2022-08-17T14:05:12.2506564Z Running distributed/fsdp/test_fsdp_pure_fp16 ... [2022-08-17 14:05:12.250192] 2022-08-17T14:05:12.2507344Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_pure_fp16.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:05:12.250264] 2022-08-17T14:05:13.8809546Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16 2022-08-17T14:05:13.8827029Z 2022-08-17T14:05:13.8827327Z Running tests... 2022-08-17T14:05:13.8827917Z ---------------------------------------------------------------------- 2022-08-17T14:05:13.8835362Z test_pure_fp16_cpu_offload_CPUOffload(offload_params=False) (__main__.TestPureFP16) 2022-08-17T14:05:15.4133305Z Tests pure FP16 training, including when the parameter's dtype is ... INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:05:15.4291507Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/73315 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.546s) 2022-08-17T14:05:15.4296420Z test_pure_fp16_cpu_offload_CPUOffload(offload_params=True) (__main__.TestPureFP16) 2022-08-17T14:05:15.4327817Z Tests pure FP16 training, including when the parameter's dtype is ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 113538 2022-08-17T14:05:15.4334094Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 113539 2022-08-17T14:05:16.8523959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:16.8524728Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:16.8526911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:16.8527401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:16.8812374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:16.8812878Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:16.8816190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:16.8816690Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:17.0200331Z dist init r=1, world=2 2022-08-17T14:05:17.0204398Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:05:17.0556635Z dist init r=0, world=2 2022-08-17T14:05:17.0561139Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:05:17.0562168Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:05:17.0613112Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:05:18.4434277Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:05:18.4434790Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:05:18.4723118Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:05:18.4723722Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:05:18.4724438Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:05:18.4724980Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:05:18.8785456Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-08-17T14:05:18.8786257Z warnings.warn( 2022-08-17T14:05:18.8886231Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1082: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-08-17T14:05:18.8886997Z warnings.warn( 2022-08-17T14:05:19.3435823Z ok (3.914s) 2022-08-17T14:05:19.3436046Z 2022-08-17T14:05:19.3436462Z ---------------------------------------------------------------------- 2022-08-17T14:05:19.3436803Z Ran 2 tests in 5.461s 2022-08-17T14:05:19.3436970Z 2022-08-17T14:05:19.3437071Z OK (skipped=1) 2022-08-17T14:05:19.3437229Z 2022-08-17T14:05:19.3437357Z Generating XML reports... 2022-08-17T14:05:19.3474378Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16/TEST-TestPureFP16-20220817140513.xml 2022-08-17T14:05:19.6955972Z Running distributed/elastic/timer/local_timer_test ... [2022-08-17 14:05:19.695118] 2022-08-17T14:05:21.2990966Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/elastic/timer/local_timer_test.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:05:19.695190] 2022-08-17T14:05:21.2991793Z Test results will be stored in test-reports/python-unittest/distributed.elastic.timer.local_timer_test 2022-08-17T14:05:21.3008926Z 2022-08-17T14:05:21.3009076Z Running tests... 2022-08-17T14:05:21.3009513Z ---------------------------------------------------------------------- 2022-08-17T14:05:21.3018170Z test_acquire_release (__main__.LocalTimerServerTest) 2022-08-17T14:05:22.8495746Z tests that: ... ok (1.548s) 2022-08-17T14:05:22.8502542Z test_expired_timers (__main__.LocalTimerServerTest) 2022-08-17T14:05:22.8521371Z tests that a single expired timer on a process should terminate ... ok (0.002s) 2022-08-17T14:05:22.8532086Z test_valid_timers (__main__.LocalTimerServerTest) 2022-08-17T14:05:22.8551336Z tests that valid timers are processed correctly and the process is left alone ... ok (0.003s) 2022-08-17T14:05:22.8558545Z test_watchdog_call_count (__main__.LocalTimerServerTest) 2022-08-17T14:05:22.9594309Z checks that the watchdog function ran wait/interval +- 1 times ... 
ok (0.104s) 2022-08-17T14:05:22.9596879Z test_watchdog_empty_queue (__main__.LocalTimerServerTest) 2022-08-17T14:05:22.9705751Z checks that the watchdog can run on an empty queue ... ok (0.011s) 2022-08-17T14:05:22.9744726Z test_client_interaction (__main__.LocalTimerTest) ... ok (0.004s) 2022-08-17T14:05:22.9858577Z test_exception_propagation (__main__.LocalTimerTest) ... ok (0.011s) 2022-08-17T14:05:22.9867319Z test_get_timer_recursive (__main__.LocalTimerTest) 2022-08-17T14:05:24.6675962Z If a function acquires a countdown timer with default scope, ... /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:24.6677129Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:24.6678821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:24.6679787Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:25.0435613Z ok (2.057s) 2022-08-17T14:05:25.1468662Z test_happy_path (__main__.LocalTimerTest) ... ok (0.103s) 2022-08-17T14:05:25.1582431Z test_no_client (__main__.LocalTimerTest) ... ok (0.011s) 2022-08-17T14:05:25.3179666Z test_timer (__main__.LocalTimerTest) ... ok (0.159s) 2022-08-17T14:05:25.3454207Z test_get (__main__.MultiprocessingRequestQueueTest) ... ok (0.027s) 2022-08-17T14:05:25.3526924Z test_get_less_than_size (__main__.MultiprocessingRequestQueueTest) 2022-08-17T14:05:25.8727878Z Tests slow producer. ... ok (0.521s) 2022-08-17T14:05:25.8745258Z test_get_size (__main__.MultiprocessingRequestQueueTest) 2022-08-17T14:05:26.7941807Z Creates a "producer" process that enqueues ``n`` elements ... 
ok (0.921s) 2022-08-17T14:05:26.7947433Z 2022-08-17T14:05:26.7947990Z ---------------------------------------------------------------------- 2022-08-17T14:05:26.7948351Z Ran 14 tests in 5.494s 2022-08-17T14:05:26.7948523Z 2022-08-17T14:05:26.7948620Z OK 2022-08-17T14:05:26.7948757Z 2022-08-17T14:05:26.7948888Z Generating XML reports... 2022-08-17T14:05:26.8045258Z Generated XML report: test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-LocalTimerServerTest-20220817140521.xml 2022-08-17T14:05:26.8053542Z Generated XML report: test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-LocalTimerTest-20220817140521.xml 2022-08-17T14:05:26.8059843Z Generated XML report: test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-MultiprocessingRequestQueueTest-20220817140521.xml 2022-08-17T14:05:27.3016352Z Running distributed/fsdp/test_fsdp_uneven ... [2022-08-17 14:05:27.301155] 2022-08-17T14:05:27.3017127Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_uneven.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:05:27.301225] 2022-08-17T14:05:28.9017150Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_uneven 2022-08-17T14:05:28.9034712Z 2022-08-17T14:05:28.9034834Z Running tests... 2022-08-17T14:05:28.9035515Z ---------------------------------------------------------------------- 2022-08-17T14:05:28.9047377Z test_one_iteration (__main__.TestUnevenParamShard) 2022-08-17T14:05:30.4294859Z Test FSDP with uneven divide of parameter shards. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:05:30.4481754Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 113746 2022-08-17T14:05:30.4487824Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 113747 2022-08-17T14:05:31.8614473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:31.8615281Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:31.8617309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:31.8617799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:31.8902791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:31.8903272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:31.8907363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:31.8907855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:32.0280577Z dist init r=1, world=2 2022-08-17T14:05:32.0284716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-08-17T14:05:32.0615716Z dist init r=0, world=2 2022-08-17T14:05:32.0620242Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-08-17T14:05:32.0621070Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-08-17T14:05:32.0692079Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-08-17T14:05:33.4538359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:05:33.4538900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:05:33.8984518Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:05:33.8985148Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:05:33.9103704Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/scatter_gather.py:9: UserWarning: is_namedtuple is deprecated, please use the python checks instead 2022-08-17T14:05:33.9104524Z warnings.warn("is_namedtuple is deprecated, please use the python checks instead") 2022-08-17T14:05:34.3590978Z ok (5.455s) 2022-08-17T14:05:34.3591211Z 2022-08-17T14:05:34.3591782Z ---------------------------------------------------------------------- 2022-08-17T14:05:34.3592137Z Ran 1 test in 5.455s 2022-08-17T14:05:34.3592305Z 2022-08-17T14:05:34.3592381Z OK 2022-08-17T14:05:34.3592517Z 2022-08-17T14:05:34.3592662Z Generating XML reports... 2022-08-17T14:05:34.3626156Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_uneven/TEST-TestUnevenParamShard-20220817140528.xml 2022-08-17T14:05:34.7069146Z Running distributed/test_data_parallel ... [2022-08-17 14:05:34.706410] 2022-08-17T14:05:34.7069947Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_data_parallel.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:05:34.706477] 2022-08-17T14:05:37.8212672Z Test results will be stored in test-reports/python-unittest/distributed.test_data_parallel 2022-08-17T14:05:37.8233590Z 2022-08-17T14:05:37.8233737Z Running tests... 
2022-08-17T14:05:37.8234162Z ---------------------------------------------------------------------- 2022-08-17T14:05:39.5653447Z test_autocast (__main__.TestDataParallel) ... ok (1.742s) 2022-08-17T14:05:39.7468456Z test_data_parallel (__main__.TestDataParallel) ... ok (0.181s) 2022-08-17T14:05:39.7590391Z test_data_parallel_buffers_requiring_grad (__main__.TestDataParallel) ... ok (0.012s) 2022-08-17T14:05:39.7617518Z test_data_parallel_complex (__main__.TestDataParallel) ... ok (0.003s) 2022-08-17T14:05:39.7675076Z test_data_parallel_device_args (__main__.TestDataParallel) ... ok (0.006s) 2022-08-17T14:05:39.7730676Z test_data_parallel_function_deletion (__main__.TestDataParallel) ... ok (0.005s) 2022-08-17T14:05:39.7744062Z test_data_parallel_lazy_linear (__main__.TestDataParallel) ... /opt/conda/lib/python3.10/site-packages/torch/nn/modules/lazy.py:180: UserWarning: Lazy modules are a new feature under heavy development so changes to the API or functionality can happen at any moment. 2022-08-17T14:05:39.7744914Z warnings.warn('Lazy modules are a new feature under heavy development ' 2022-08-17T14:05:39.7753484Z ok (0.002s) 2022-08-17T14:05:39.7789461Z test_data_parallel_model_device (__main__.TestDataParallel) 2022-08-17T14:05:39.8088285Z Test device[0] check at forward time. ... ok (0.033s) 2022-08-17T14:05:39.8962980Z test_data_parallel_model_no_refcycles (__main__.TestDataParallel) ... ok (0.087s) 2022-08-17T14:05:39.9009810Z test_data_parallel_module_zero_inputs (__main__.TestDataParallel) ... ok (0.005s) 2022-08-17T14:05:39.9063663Z test_data_parallel_multiple_input (__main__.TestDataParallel) ... /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/comm.py:231: UserWarning: Using -1 to represent CPU tensor is deprecated. Please use a device object or string instead, e.g., "cpu". 
2022-08-17T14:05:39.9064567Z warnings.warn( 2022-08-17T14:05:39.9218598Z ok (0.021s) 2022-08-17T14:05:39.9247162Z test_data_parallel_nested_input (__main__.TestDataParallel) ... ok (0.003s) 2022-08-17T14:05:39.9304230Z test_data_parallel_nested_output (__main__.TestDataParallel) ... ok (0.006s) 2022-08-17T14:05:39.9344050Z test_data_parallel_no_grad (__main__.TestDataParallel) ... ok (0.004s) 2022-08-17T14:05:40.9431698Z test_data_parallel_rnn (__main__.TestDataParallel) ... ok (1.008s) 2022-08-17T14:05:40.9463880Z test_data_parallel_small_back (__main__.TestDataParallel) ... ok (0.003s) 2022-08-17T14:05:40.9580185Z test_data_parallel_sparse (__main__.TestDataParallel) ... ok (0.012s) 2022-08-17T14:05:40.9794588Z test_gather_cpu (__main__.TestDataParallel) ... /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector. 2022-08-17T14:05:40.9795341Z warnings.warn('Was asked to gather along dimension 0, but all ' 2022-08-17T14:05:40.9997534Z ok (0.042s) 2022-08-17T14:05:41.0008582Z test_gather_different_len_dicts (__main__.TestDataParallel) ... ok (0.001s) 2022-08-17T14:05:41.0423937Z test_gather_gpu (__main__.TestDataParallel) ... ok (0.041s) 2022-08-17T14:05:41.0475524Z test_parallel_apply (__main__.TestDataParallel) ... ok (0.005s) 2022-08-17T14:05:41.0530264Z test_parallel_apply_autocast (__main__.TestDataParallel) ... ok (0.005s) 2022-08-17T14:05:41.0551553Z test_parallel_apply_passes_exception (__main__.TestDataParallel) ... ok (0.002s) 2022-08-17T14:05:41.0625064Z test_parameter_list_dict_replica (__main__.TestDataParallel) ... ok (0.007s) 2022-08-17T14:05:41.0667347Z test_replicate (__main__.TestDataParallel) ... ok (0.004s) 2022-08-17T14:05:41.0700785Z test_replicate_buffers (__main__.TestDataParallel) ... ok (0.003s) 2022-08-17T14:05:41.0733047Z test_save_replica_module (__main__.TestDataParallel) ... 
ok (0.003s) 2022-08-17T14:05:41.0905855Z test_scatter_cpu (__main__.TestDataParallel) ... ok (0.017s) 2022-08-17T14:05:41.1084370Z test_scatter_gpu (__main__.TestDataParallel) ... ok (0.018s) 2022-08-17T14:05:42.2986470Z test_strided_grad_layout (__main__.TestDataParallel) ... ok (1.190s) 2022-08-17T14:05:42.3061716Z test_zero_grad (__main__.TestDataParallel) ... ok (0.008s) 2022-08-17T14:05:42.3112641Z test_data_parallel_module_cuda_float16 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-08-17T14:05:42.3152470Z test_data_parallel_module_cuda_float32 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3191482Z test_data_parallel_module_cuda_float64 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3235608Z test_data_parallel_module_kwargs_only_cuda_float16 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3279095Z test_data_parallel_module_kwargs_only_cuda_float32 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3320820Z test_data_parallel_module_kwargs_only_cuda_float64 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3364777Z test_data_parallel_module_kwargs_only_empty_dict_cuda_float16 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3408669Z test_data_parallel_module_kwargs_only_empty_dict_cuda_float32 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3452145Z test_data_parallel_module_kwargs_only_empty_dict_cuda_float64 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3495772Z test_data_parallel_module_kwargs_only_empty_list_cuda_float16 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3538639Z test_data_parallel_module_kwargs_only_empty_list_cuda_float32 (__main__.TestDataParallelDeviceTypeCUDA) ... 
ok (0.004s) 2022-08-17T14:05:42.3580862Z test_data_parallel_module_kwargs_only_empty_list_cuda_float64 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3626545Z test_data_parallel_module_kwargs_only_empty_tuple_cuda_float16 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3669592Z test_data_parallel_module_kwargs_only_empty_tuple_cuda_float32 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3713354Z test_data_parallel_module_kwargs_only_empty_tuple_cuda_float64 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-08-17T14:05:42.3714347Z 2022-08-17T14:05:42.3714737Z ---------------------------------------------------------------------- 2022-08-17T14:05:42.3715089Z Ran 46 tests in 4.548s 2022-08-17T14:05:42.3715256Z 2022-08-17T14:05:42.3715350Z OK 2022-08-17T14:05:42.3715465Z 2022-08-17T14:05:42.3715592Z Generating XML reports... 2022-08-17T14:05:42.3781044Z Generated XML report: test-reports/python-unittest/distributed.test_data_parallel/TEST-TestDataParallel-20220817140537.xml 2022-08-17T14:05:42.3798306Z Generated XML report: test-reports/python-unittest/distributed.test_data_parallel/TEST-TestDataParallelDeviceTypeCUDA-20220817140537.xml 2022-08-17T14:05:43.1861690Z Running distributed/elastic/utils/distributed_test ... [2022-08-17 14:05:43.185700] 2022-08-17T14:05:43.1862492Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/elastic/utils/distributed_test.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:05:43.185770] 2022-08-17T14:05:44.8639887Z Test results will be stored in test-reports/python-unittest/distributed.elastic.utils.distributed_test 2022-08-17T14:05:44.8657093Z 2022-08-17T14:05:44.8657245Z Running tests... 2022-08-17T14:05:44.8658373Z ---------------------------------------------------------------------- 2022-08-17T14:05:46.4781478Z test_create_store_multi (__main__.DistributedUtilTest) ... 
ok (1.612s) 2022-08-17T14:05:46.4796522Z test_create_store_no_port_multi (__main__.DistributedUtilTest) ... ok (0.001s) 2022-08-17T14:05:46.4802450Z test_create_store_single_server (__main__.DistributedUtilTest) ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/66207 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.000s) 2022-08-17T14:05:49.5027737Z test_create_store_timeout_on_server (__main__.DistributedUtilTest) ... ok (3.022s) 2022-08-17T14:05:49.5036982Z test_create_store_timeout_on_worker (__main__.DistributedUtilTest) ... [E socket.cpp:860] [c10d] The client socket has timed out after 1s while trying to connect to (20a8245f1146, 0). 2022-08-17T14:05:49.5037876Z ok (0.001s) 2022-08-17T14:05:49.5053848Z test_port_already_in_use_on_server (__main__.DistributedUtilTest) ... [W socket.cpp:426] [c10d] The server socket has failed to bind to [::]:45105 (errno: 98 - Address already in use). 2022-08-17T14:05:49.5071460Z [W socket.cpp:426] [c10d] The server socket has failed to bind to 0.0.0.0:45105 (errno: 98 - Address already in use). 2022-08-17T14:05:49.5071950Z [E socket.cpp:462] [c10d] The server socket has failed to listen on any local network address. 2022-08-17T14:05:49.5074911Z ok (0.004s) 2022-08-17T14:05:49.5106970Z test_port_already_in_use_on_worker (__main__.DistributedUtilTest) ... [E socket.cpp:860] [c10d] The client socket has timed out after 1s while trying to connect to (20a8245f1146, 57825). 
2022-08-17T14:05:49.5107735Z ok (0.003s) 2022-08-17T14:05:49.5110030Z 2022-08-17T14:05:49.5110659Z ---------------------------------------------------------------------- 2022-08-17T14:05:49.5111034Z Ran 7 tests in 4.645s 2022-08-17T14:05:49.5111205Z 2022-08-17T14:05:49.5111328Z OK (skipped=1) 2022-08-17T14:05:49.5111483Z 2022-08-17T14:05:49.5111593Z Generating XML reports... 2022-08-17T14:05:49.5155228Z Generated XML report: test-reports/python-unittest/distributed.elastic.utils.distributed_test/TEST-DistributedUtilTest-20220817140544.xml 2022-08-17T14:05:49.9042537Z Running distributed/_shard/sharded_tensor/test_megatron_prototype ... [2022-08-17 14:05:49.903789] 2022-08-17T14:05:49.9043359Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_megatron_prototype.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:05:49.903863] 2022-08-17T14:05:51.5151991Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.test_megatron_prototype 2022-08-17T14:05:51.5167714Z 2022-08-17T14:05:51.5168058Z Running tests... 2022-08-17T14:05:51.5168504Z ---------------------------------------------------------------------- 2022-08-17T14:05:53.0433230Z test_megatron_two_layer_prototype (__main__.TestShardedTensorMegatronLinear) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-08-17T14:05:53.0617428Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 114119 2022-08-17T14:05:53.0624408Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 114120 2022-08-17T14:05:53.0630594Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 114121 2022-08-17T14:05:53.0636672Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 114122 2022-08-17T14:05:54.4761546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:54.4762368Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:54.4763615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:54.4764106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:54.4786512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:54.4787011Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:54.4789986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:54.4790708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:54.4797509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:54.4797982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:54.4800921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:54.4801408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:54.5043712Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 39 slow tests 2022-08-17T14:05:54.5044190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-08-17T14:05:54.5046903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 238 disabled tests 2022-08-17T14:05:54.5047401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-08-17T14:05:54.6456312Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-08-17T14:05:54.6511909Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-08-17T14:05:54.6512410Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-08-17T14:05:54.6758011Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-08-17T14:05:55.0697619Z skip: Need at least 4 CUDA devices (3.553s) 2022-08-17T14:05:55.0698062Z 2022-08-17T14:05:55.0698710Z ---------------------------------------------------------------------- 2022-08-17T14:05:55.0699297Z Ran 1 test in 3.553s 2022-08-17T14:05:55.0699606Z 2022-08-17T14:05:55.0699792Z OK (skipped=1) 2022-08-17T14:05:55.0700073Z 2022-08-17T14:05:55.0700305Z Generating XML reports... 2022-08-17T14:05:55.0737740Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_megatron_prototype/TEST-TestShardedTensorMegatronLinear-20220817140551.xml 2022-08-17T14:05:55.4244706Z Running distributed/elastic/utils/util_test ... [2022-08-17 14:05:55.423995] 2022-08-17T14:05:55.4245461Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/elastic/utils/util_test.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 14:05:55.424066] 2022-08-17T14:05:56.9691025Z Test results will be stored in test-reports/python-unittest/distributed.elastic.utils.util_test 2022-08-17T14:05:56.9708008Z 2022-08-17T14:05:56.9708607Z Running tests... 2022-08-17T14:05:56.9709474Z ---------------------------------------------------------------------- 2022-08-17T14:05:58.5029372Z test_get_all_rank_0 (__main__.StoreUtilTest) ... ok (1.532s) 2022-08-17T14:05:58.5051825Z test_get_all_rank_n (__main__.StoreUtilTest) ... ok (0.002s) 2022-08-17T14:05:58.5083657Z test_synchronize (__main__.StoreUtilTest) ... ok (0.003s) 2022-08-17T14:05:58.6104755Z test_get_logger (__main__.UtilTest) ... ok (0.102s) 2022-08-17T14:05:58.6112223Z test_get_logger_custom_name (__main__.UtilTest) ... ok (0.001s) 2022-08-17T14:05:58.6122069Z test_get_logger_different (__main__.UtilTest) ... ok (0.001s) 2022-08-17T14:05:58.6135772Z test_get_logger_none (__main__.UtilTest) ... ok (0.001s) 2022-08-17T14:05:58.6136199Z 2022-08-17T14:05:58.6136588Z ---------------------------------------------------------------------- 2022-08-17T14:05:58.6136946Z Ran 7 tests in 1.643s 2022-08-17T14:05:58.6137119Z 2022-08-17T14:05:58.6137214Z OK 2022-08-17T14:05:58.6137350Z 2022-08-17T14:05:58.6137486Z Generating XML reports... 2022-08-17T14:05:58.6181990Z Generated XML report: test-reports/python-unittest/distributed.elastic.utils.util_test/TEST-StoreUtilTest-20220817140556.xml 2022-08-17T14:05:58.6188201Z Generated XML report: test-reports/python-unittest/distributed.elastic.utils.util_test/TEST-UtilTest-20220817140556.xml 2022-08-17T14:05:58.9688571Z Running distributed/fsdp/test_checkpoint_wrapper ... [2022-08-17 14:05:58.968392] 2022-08-17T14:05:58.9689620Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_checkpoint_wrapper.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 14:05:58.968465] 2022-08-17T14:06:00.5429649Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_checkpoint_wrapper 2022-08-17T14:06:00.5446497Z 2022-08-17T14:06:00.5446859Z Running tests... 2022-08-17T14:06:00.5447298Z ---------------------------------------------------------------------- 2022-08-17T14:06:00.5470892Z test_apply_activation_checkpointing_wrapper (__main__.CheckpointWrapperTest) 2022-08-17T14:06:02.1437198Z Ensures that `apply_activation_checkpointing_wrapper` can be used ... ok (1.599s) 2022-08-17T14:06:02.1456832Z test_checkpoint_wrapper_parity (__main__.CheckpointWrapperTest) 2022-08-17T14:06:02.1461001Z Tests that using checkpoint_wrapper or the functional ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/79510 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.002s) 2022-08-17T14:06:02.1474338Z test_forward_missing_attributes (__main__.CheckpointWrapperTest) ... ok (0.001s) 2022-08-17T14:06:02.1485112Z test_fqn (__main__.CheckpointWrapperTest) ... ok (0.001s) 2022-08-17T14:06:02.1519187Z test_load_activation_checkpointed_module (__main__.CheckpointWrapperTest) ... ok (0.003s) 2022-08-17T14:06:02.1519789Z 2022-08-17T14:06:02.1520085Z ---------------------------------------------------------------------- 2022-08-17T14:06:02.1520441Z Ran 5 tests in 1.607s 2022-08-17T14:06:02.1520611Z 2022-08-17T14:06:02.1520724Z OK (skipped=1) 2022-08-17T14:06:02.1520890Z 2022-08-17T14:06:02.1521000Z Generating XML reports... 2022-08-17T14:06:02.1561429Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_checkpoint_wrapper/TEST-CheckpointWrapperTest-20220817140600.xml 2022-08-17T14:06:02.5335095Z Running distributed/_shard/checkpoint/test_utils ... 
[2022-08-17 14:06:02.533006] 2022-08-17T14:06:02.5335864Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/checkpoint/test_utils.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:06:02.533075] 2022-08-17T14:06:04.0982167Z Test results will be stored in test-reports/python-unittest/distributed._shard.checkpoint.test_utils 2022-08-17T14:06:04.0999620Z 2022-08-17T14:06:04.0999855Z Running tests... 2022-08-17T14:06:04.1000323Z ---------------------------------------------------------------------- 2022-08-17T14:06:05.6101662Z test_flat_data (__main__.TestMedatadaIndex) ... ok (1.510s) 2022-08-17T14:06:05.6111016Z test_index_hint_ignored_on_equals (__main__.TestMedatadaIndex) ... ok (0.001s) 2022-08-17T14:06:05.6119545Z test_index_hint_ignored_on_hash (__main__.TestMedatadaIndex) ... ok (0.001s) 2022-08-17T14:06:05.6128259Z test_init_convert_offset (__main__.TestMedatadaIndex) ... ok (0.001s) 2022-08-17T14:06:05.6162095Z test_sharded_tensor_lookup (__main__.TestMedatadaIndex) ... ok (0.003s) 2022-08-17T14:06:05.6162349Z 2022-08-17T14:06:05.6162738Z ---------------------------------------------------------------------- 2022-08-17T14:06:05.6163092Z Ran 5 tests in 1.516s 2022-08-17T14:06:05.6163263Z 2022-08-17T14:06:05.6163338Z OK 2022-08-17T14:06:05.6163476Z 2022-08-17T14:06:05.6163604Z Generating XML reports... 2022-08-17T14:06:05.6201538Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_utils/TEST-TestMedatadaIndex-20220817140604.xml 2022-08-17T14:06:05.9624027Z Running distributed/elastic/utils/logging_test ... [2022-08-17 14:06:05.961912] 2022-08-17T14:06:05.9624811Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/elastic/utils/logging_test.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 14:06:05.961985] 2022-08-17T14:06:07.5926293Z Test results will be stored in test-reports/python-unittest/distributed.elastic.utils.logging_test 2022-08-17T14:06:07.5944469Z 2022-08-17T14:06:07.5944958Z Running tests... 2022-08-17T14:06:07.5945477Z ---------------------------------------------------------------------- 2022-08-17T14:06:09.1592325Z test_derive_module_name (__main__.LoggingTest) ... ok (1.564s) 2022-08-17T14:06:09.1617406Z test_logger_name (__main__.LoggingTest) ... ok (0.002s) 2022-08-17T14:06:09.1617874Z 2022-08-17T14:06:09.1618550Z ---------------------------------------------------------------------- 2022-08-17T14:06:09.1619191Z Ran 2 tests in 1.567s 2022-08-17T14:06:09.1619511Z 2022-08-17T14:06:09.1619672Z OK 2022-08-17T14:06:09.1619919Z 2022-08-17T14:06:09.1620161Z Generating XML reports... 2022-08-17T14:06:09.1657401Z Generated XML report: test-reports/python-unittest/distributed.elastic.utils.logging_test/TEST-LoggingTest-20220817140607.xml 2022-08-17T14:06:09.5246975Z Running distributed/test_launcher ... [2022-08-17 14:06:09.524207] 2022-08-17T14:06:09.5247734Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_launcher.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:06:09.524276] 2022-08-17T14:06:11.6112443Z Test results will be stored in test-reports/python-unittest/distributed.test_launcher 2022-08-17T14:06:11.6128194Z 2022-08-17T14:06:11.6128425Z Running tests... 2022-08-17T14:06:11.6128860Z ---------------------------------------------------------------------- 2022-08-17T14:06:13.1621200Z test_launch_user_script (__main__.TestDistributedLaunch) ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/79488 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.549s) 2022-08-17T14:06:13.1621866Z 2022-08-17T14:06:13.1622168Z ---------------------------------------------------------------------- 2022-08-17T14:06:13.1622506Z Ran 1 test in 1.549s 2022-08-17T14:06:13.1622664Z 2022-08-17T14:06:13.1622779Z OK (skipped=1) 2022-08-17T14:06:13.1622937Z 2022-08-17T14:06:13.1623063Z Generating XML reports... 2022-08-17T14:06:13.1654661Z Generated XML report: test-reports/python-unittest/distributed.test_launcher/TEST-TestDistributedLaunch-20220817140611.xml 2022-08-17T14:06:13.5209169Z Running distributed/_shard/test_replicated_tensor ... [2022-08-17 14:06:13.520419] 2022-08-17T14:06:13.5209944Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/test_replicated_tensor.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:06:13.520492] 2022-08-17T14:06:15.3369431Z Running distributed/elastic/events/lib_test ... [2022-08-17 14:06:15.336489] 2022-08-17T14:06:15.3370149Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/elastic/events/lib_test.py', '-v'] ... [2022-08-17 14:06:15.336560] 2022-08-17T14:06:16.1526762Z ============================= test session starts ============================== 2022-08-17T14:06:16.1527353Z platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python 2022-08-17T14:06:16.1627793Z cachedir: .pytest_cache 2022-08-17T14:06:16.1628381Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-08-17T14:06:16.1628893Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-08-17T14:06:16.1629392Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0 2022-08-17T14:06:17.5924251Z collecting ...  
2022-08-17T14:06:17.5934951Z collecting 3 items  2022-08-17T14:06:17.5935404Z collected 8 items  2022-08-17T14:06:17.5940218Z 2022-08-17T14:06:17.5956942Z distributed/elastic/events/lib_test.py::EventLibTest::test_event_created PASSED [ 12%] 2022-08-17T14:06:17.5971115Z distributed/elastic/events/lib_test.py::EventLibTest::test_event_deser PASSED [ 25%] 2022-08-17T14:06:17.5990276Z distributed/elastic/events/lib_test.py::EventLibTest::test_get_or_create_logger PASSED [ 37%] 2022-08-17T14:06:17.6888073Z distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_construct_and_record_rdzv_event PASSED [ 50%] 2022-08-17T14:06:17.6907405Z distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_construct_and_record_rdzv_event_does_not_run_if_invalid_dest PASSED [ 62%] 2022-08-17T14:06:17.6919943Z distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_rdzv_event_created PASSED [ 75%] 2022-08-17T14:06:17.6933291Z distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_rdzv_event_deserialize PASSED [ 87%] 2022-08-17T14:06:17.6953819Z distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_rdzv_event_str PASSED [100%] 2022-08-17T14:06:17.6955151Z 2022-08-17T14:06:17.6955558Z ============================== 8 passed in 1.54s =============================== 2022-08-17T14:06:17.9454818Z Running distributed/fsdp/test_shard_utils ... [2022-08-17 14:06:17.945039] 2022-08-17T14:06:17.9455550Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_shard_utils.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-08-17 14:06:17.945115] 2022-08-17T14:06:19.7370914Z Running distributed/pipeline/sync/skip/test_gpipe ... [2022-08-17 14:06:19.736632] 2022-08-17T14:06:19.7371595Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_gpipe.py', '-v'] ... 
[2022-08-17 14:06:19.736701] 2022-08-17T14:06:21.5262530Z ============================= test session starts ============================== 2022-08-17T14:06:21.5263135Z platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python 2022-08-17T14:06:21.5335250Z cachedir: .pytest_cache 2022-08-17T14:06:21.5336143Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-08-17T14:06:21.5336597Z torch: 1.13.0a0+gitce6a3c6 2022-08-17T14:06:21.5336932Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-08-17T14:06:21.5337441Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0 2022-08-17T14:06:21.5633394Z collecting ...  2022-08-17T14:06:21.5633829Z collected 13 items  2022-08-17T14:06:21.5638126Z 2022-08-17T14:06:24.1881162Z distributed/pipeline/sync/skip/test_gpipe.py::test_1to3[never-3] PASSED [ 7%] 2022-08-17T14:06:26.0826976Z distributed/pipeline/sync/skip/test_gpipe.py::test_1to3[never-1:2] PASSED [ 15%] 2022-08-17T14:06:26.1421483Z distributed/pipeline/sync/skip/test_gpipe.py::test_1to3[never-2:1] PASSED [ 23%] 2022-08-17T14:06:26.1573089Z distributed/pipeline/sync/skip/test_gpipe.py::test_1to3[never-1:1:1] SKIPPED [ 30%] 2022-08-17T14:06:26.2113207Z distributed/pipeline/sync/skip/test_gpipe.py::test_1to3[always-3] PASSED [ 38%] 2022-08-17T14:06:26.2828610Z distributed/pipeline/sync/skip/test_gpipe.py::test_1to3[always-1:2] PASSED [ 46%] 2022-08-17T14:06:26.3449246Z distributed/pipeline/sync/skip/test_gpipe.py::test_1to3[always-2:1] PASSED [ 53%] 2022-08-17T14:06:26.3598631Z distributed/pipeline/sync/skip/test_gpipe.py::test_1to3[always-1:1:1] SKIPPED [ 61%] 2022-08-17T14:06:26.4101943Z distributed/pipeline/sync/skip/test_gpipe.py::test_1to3[except_last-3] PASSED [ 69%] 2022-08-17T14:06:26.4612079Z distributed/pipeline/sync/skip/test_gpipe.py::test_1to3[except_last-1:2] PASSED [ 76%] 2022-08-17T14:06:26.5084253Z 
distributed/pipeline/sync/skip/test_gpipe.py::test_1to3[except_last-2:1] PASSED [ 84%] 2022-08-17T14:06:26.5235644Z distributed/pipeline/sync/skip/test_gpipe.py::test_1to3[except_last-1:1:1] SKIPPED [ 92%] 2022-08-17T14:06:26.5511295Z distributed/pipeline/sync/skip/test_gpipe.py::test_none_skip PASSED [100%] 2022-08-17T14:06:26.5512442Z 2022-08-17T14:06:26.5512795Z ======================== 10 passed, 3 skipped in 5.03s ========================= 2022-08-17T14:06:27.2369980Z Running distributed/pipeline/sync/skip/test_leak ... [2022-08-17 14:06:27.236493] 2022-08-17T14:06:27.2370759Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_leak.py', '-v'] ... [2022-08-17 14:06:27.236569] 2022-08-17T14:06:29.0464487Z ============================= test session starts ============================== 2022-08-17T14:06:29.0465423Z platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python 2022-08-17T14:06:29.0535853Z cachedir: .pytest_cache 2022-08-17T14:06:29.0536445Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-08-17T14:06:29.0536891Z torch: 1.13.0a0+gitce6a3c6 2022-08-17T14:06:29.0537228Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-08-17T14:06:29.0537707Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0 2022-08-17T14:06:29.0729819Z collecting ...  
2022-08-17T14:06:29.0730233Z collected 8 items  2022-08-17T14:06:29.0734623Z 2022-08-17T14:06:29.1648052Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-train] PASSED [ 12%] 2022-08-17T14:06:29.1831837Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-eval] PASSED [ 25%] 2022-08-17T14:06:29.2038733Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-train] PASSED [ 37%] 2022-08-17T14:06:29.2218108Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-eval] PASSED [ 50%] 2022-08-17T14:06:29.2407137Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-train] PASSED [ 62%] 2022-08-17T14:06:29.2589869Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-eval] PASSED [ 75%] 2022-08-17T14:06:29.2746519Z distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[train] PASSED [ 87%] 2022-08-17T14:06:29.2903021Z distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[eval] PASSED [100%] 2022-08-17T14:06:29.2904995Z 2022-08-17T14:06:29.2905584Z ============================== 8 passed in 0.24s =============================== 2022-08-17T14:06:29.5293855Z Running distributed/pipeline/sync/skip/test_stash_pop ... [2022-08-17 14:06:29.528939] 2022-08-17T14:06:29.5294543Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_stash_pop.py', '-v'] ... 
[2022-08-17 14:06:29.529011] 2022-08-17T14:06:31.3368109Z ============================= test session starts ============================== 2022-08-17T14:06:31.3368687Z platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python 2022-08-17T14:06:31.3439914Z cachedir: .pytest_cache 2022-08-17T14:06:31.3440517Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-08-17T14:06:31.3440971Z torch: 1.13.0a0+gitce6a3c6 2022-08-17T14:06:31.3441310Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-08-17T14:06:31.3441787Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0 2022-08-17T14:06:31.3624232Z collecting ...  2022-08-17T14:06:31.3624801Z collected 7 items  2022-08-17T14:06:31.3629771Z 2022-08-17T14:06:31.3844473Z distributed/pipeline/sync/skip/test_stash_pop.py::test_stash PASSED [ 14%] 2022-08-17T14:06:31.3864775Z distributed/pipeline/sync/skip/test_stash_pop.py::test_pop PASSED [ 28%] 2022-08-17T14:06:31.3887032Z distributed/pipeline/sync/skip/test_stash_pop.py::test_declare_but_not_use PASSED [ 42%] 2022-08-17T14:06:31.3904658Z distributed/pipeline/sync/skip/test_stash_pop.py::test_stash_not_declared PASSED [ 57%] 2022-08-17T14:06:31.3923736Z distributed/pipeline/sync/skip/test_stash_pop.py::test_pop_not_declared PASSED [ 71%] 2022-08-17T14:06:31.3940903Z distributed/pipeline/sync/skip/test_stash_pop.py::test_pop_not_stashed PASSED [ 85%] 2022-08-17T14:06:31.3961767Z distributed/pipeline/sync/skip/test_stash_pop.py::test_stash_none PASSED [100%] 2022-08-17T14:06:31.3962814Z 2022-08-17T14:06:31.3963135Z ============================== 7 passed in 0.06s =============================== 2022-08-17T14:06:31.6292263Z Running distributed/pipeline/sync/skip/test_verify_skippables ... 
[2022-08-17 14:06:31.628776] 2022-08-17T14:06:31.6292950Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_verify_skippables.py', '-v'] ... [2022-08-17 14:06:31.628846] 2022-08-17T14:06:33.4340484Z ============================= test session starts ============================== 2022-08-17T14:06:33.4341074Z platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python 2022-08-17T14:06:33.4412542Z cachedir: .pytest_cache 2022-08-17T14:06:33.4413167Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-08-17T14:06:33.4413613Z torch: 1.13.0a0+gitce6a3c6 2022-08-17T14:06:33.4413948Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-08-17T14:06:33.4414460Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0 2022-08-17T14:06:33.4629340Z collecting ...  2022-08-17T14:06:33.4629759Z collected 9 items  2022-08-17T14:06:33.4634294Z 2022-08-17T14:06:33.4667785Z distributed/pipeline/sync/skip/test_verify_skippables.py::test_matching PASSED [ 11%] 2022-08-17T14:06:33.4685773Z distributed/pipeline/sync/skip/test_verify_skippables.py::test_stash_not_pop PASSED [ 22%] 2022-08-17T14:06:33.4703161Z distributed/pipeline/sync/skip/test_verify_skippables.py::test_pop_unknown PASSED [ 33%] 2022-08-17T14:06:33.4723612Z distributed/pipeline/sync/skip/test_verify_skippables.py::test_stash_again PASSED [ 44%] 2022-08-17T14:06:33.4742227Z distributed/pipeline/sync/skip/test_verify_skippables.py::test_pop_again PASSED [ 55%] 2022-08-17T14:06:33.4761154Z distributed/pipeline/sync/skip/test_verify_skippables.py::test_stash_pop_together_different_names PASSED [ 66%] 2022-08-17T14:06:33.4778274Z distributed/pipeline/sync/skip/test_verify_skippables.py::test_stash_pop_together_same_name PASSED [ 77%] 2022-08-17T14:06:33.4799589Z distributed/pipeline/sync/skip/test_verify_skippables.py::test_double_stash_pop PASSED [ 
88%] 2022-08-17T14:06:33.4821852Z distributed/pipeline/sync/skip/test_verify_skippables.py::test_double_stash_pop_but_isolated PASSED [100%] 2022-08-17T14:06:33.4822736Z 2022-08-17T14:06:33.4823051Z ============================== 9 passed in 0.05s =============================== 2022-08-17T14:06:33.7420179Z Running distributed/pipeline/sync/test_bugs ... [2022-08-17 14:06:33.741579] 2022-08-17T14:06:33.7420839Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_bugs.py', '-v'] ... [2022-08-17 14:06:33.741652] 2022-08-17T14:06:35.6039195Z ============================= test session starts ============================== 2022-08-17T14:06:35.6039808Z platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python 2022-08-17T14:06:35.6113223Z cachedir: .pytest_cache 2022-08-17T14:06:35.6114365Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-08-17T14:06:35.6114855Z torch: 1.13.0a0+gitce6a3c6 2022-08-17T14:06:35.6115191Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-08-17T14:06:35.6115699Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0 2022-08-17T14:06:35.6536993Z collecting ...  2022-08-17T14:06:35.6537410Z collected 4 items  2022-08-17T14:06:35.6541970Z 2022-08-17T14:06:35.7345516Z distributed/pipeline/sync/test_bugs.py::test_python_autograd_function PASSED [ 25%] 2022-08-17T14:06:35.7525321Z distributed/pipeline/sync/test_bugs.py::test_exception_no_hang PASSED [ 50%] 2022-08-17T14:06:39.3600145Z distributed/pipeline/sync/test_bugs.py::test_tuple_wait PASSED [ 75%] 2022-08-17T14:06:39.4926966Z distributed/pipeline/sync/test_bugs.py::test_parallel_randoms PASSED [100%] 2022-08-17T14:06:39.4928476Z 2022-08-17T14:06:39.4928964Z ============================== 4 passed in 3.89s =============================== 2022-08-17T14:06:39.8452609Z Running distributed/pipeline/sync/test_copy ... 
[2022-08-17 14:06:39.844759] 2022-08-17T14:06:39.8453264Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_copy.py', '-v'] ... [2022-08-17 14:06:39.844833] 2022-08-17T14:06:41.6731136Z ============================= test session starts ============================== 2022-08-17T14:06:41.6731694Z platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python 2022-08-17T14:06:41.6803117Z cachedir: .pytest_cache 2022-08-17T14:06:41.6804034Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-08-17T14:06:41.6804512Z torch: 1.13.0a0+gitce6a3c6 2022-08-17T14:06:41.6804822Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-08-17T14:06:41.6805319Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0 2022-08-17T14:06:41.7225401Z collecting ...  2022-08-17T14:06:41.7225856Z collected 5 items  2022-08-17T14:06:41.7230346Z 2022-08-17T14:06:41.7287038Z distributed/pipeline/sync/test_copy.py::test_copy_wait_cpu_cpu PASSED [ 20%] 2022-08-17T14:06:42.9665389Z distributed/pipeline/sync/test_copy.py::test_copy_wait_cpu_cuda PASSED [ 40%] 2022-08-17T14:06:43.4164316Z distributed/pipeline/sync/test_copy.py::test_copy_wait_cuda_cpu PASSED [ 60%] 2022-08-17T14:06:43.7746805Z distributed/pipeline/sync/test_copy.py::test_copy_wait_cuda_cuda PASSED [ 80%] 2022-08-17T14:06:43.7766303Z distributed/pipeline/sync/test_copy.py::test_wait_multiple_tensors PASSED [100%] 2022-08-17T14:06:43.7767521Z 2022-08-17T14:06:43.7767863Z ============================== 5 passed in 2.10s =============================== 2022-08-17T14:06:44.0877144Z Running distributed/pipeline/sync/test_dependency ... [2022-08-17 14:06:44.087261] 2022-08-17T14:06:44.0878037Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_dependency.py', '-v'] ... 
[2022-08-17 14:06:44.087334] 2022-08-17T14:06:45.8963203Z ============================= test session starts ============================== 2022-08-17T14:06:45.8963795Z platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python 2022-08-17T14:06:45.9033428Z cachedir: .pytest_cache 2022-08-17T14:06:45.9034594Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-08-17T14:06:45.9035073Z torch: 1.13.0a0+gitce6a3c6 2022-08-17T14:06:45.9035387Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-08-17T14:06:45.9035889Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0 2022-08-17T14:06:45.9341317Z collecting ...  2022-08-17T14:06:45.9341999Z collected 6 items  2022-08-17T14:06:45.9346598Z 2022-08-17T14:06:47.1377442Z distributed/pipeline/sync/test_dependency.py::test_fork_join PASSED [ 16%] 2022-08-17T14:06:47.1391571Z distributed/pipeline/sync/test_dependency.py::test_fork_join_enable_grad PASSED [ 33%] 2022-08-17T14:06:47.1406544Z distributed/pipeline/sync/test_dependency.py::test_fork_join_no_grad PASSED [ 50%] 2022-08-17T14:06:47.1422447Z distributed/pipeline/sync/test_dependency.py::test_fork_leak PASSED [ 66%] 2022-08-17T14:06:47.1436520Z distributed/pipeline/sync/test_dependency.py::test_join_when_fork_not_requires_grad PASSED [ 83%] 2022-08-17T14:06:47.1453922Z distributed/pipeline/sync/test_dependency.py::test_join_when_fork_requires_grad PASSED [100%] 2022-08-17T14:06:47.1455693Z 2022-08-17T14:06:47.1456024Z ============================== 6 passed in 1.25s =============================== 2022-08-17T14:06:47.4512336Z Running distributed/pipeline/sync/test_microbatch ... [2022-08-17 14:06:47.450741] 2022-08-17T14:06:47.4512997Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_microbatch.py', '-v'] ... 
[2022-08-17 14:06:47.450815] 2022-08-17T14:06:49.2378024Z ============================= test session starts ============================== 2022-08-17T14:06:49.2378629Z platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python 2022-08-17T14:06:49.2449252Z cachedir: .pytest_cache 2022-08-17T14:06:49.2450162Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-08-17T14:06:49.2450618Z torch: 1.13.0a0+gitce6a3c6 2022-08-17T14:06:49.2450960Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-08-17T14:06:49.2451464Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0 2022-08-17T14:06:49.2783779Z collecting ...  2022-08-17T14:06:49.2784356Z collected 10 items  2022-08-17T14:06:49.2789067Z 2022-08-17T14:06:49.2821413Z distributed/pipeline/sync/test_microbatch.py::test_batch_atomic PASSED [ 10%] 2022-08-17T14:06:49.2838391Z distributed/pipeline/sync/test_microbatch.py::test_batch_non_atomic PASSED [ 20%] 2022-08-17T14:06:49.2854857Z distributed/pipeline/sync/test_microbatch.py::test_batch_call PASSED [ 30%] 2022-08-17T14:06:49.2872397Z distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_index PASSED [ 40%] 2022-08-17T14:06:49.2889281Z distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_slice PASSED [ 50%] 2022-08-17T14:06:49.2909994Z distributed/pipeline/sync/test_microbatch.py::test_check PASSED [ 60%] 2022-08-17T14:06:49.3124049Z distributed/pipeline/sync/test_microbatch.py::test_gather_tensors PASSED [ 70%] 2022-08-17T14:06:49.3140996Z distributed/pipeline/sync/test_microbatch.py::test_gather_tuples PASSED [ 80%] 2022-08-17T14:06:49.3158487Z distributed/pipeline/sync/test_microbatch.py::test_scatter_tensor PASSED [ 90%] 2022-08-17T14:06:49.3178610Z distributed/pipeline/sync/test_microbatch.py::test_scatter_multiple_tensors PASSED [100%] 2022-08-17T14:06:49.3179748Z 2022-08-17T14:06:49.3180388Z 
============================== 10 passed in 0.08s ============================== 2022-08-17T14:06:49.5651799Z Running distributed/pipeline/sync/test_pipe ... [2022-08-17 14:06:49.564753] 2022-08-17T14:06:49.5652466Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_pipe.py', '-v'] ... [2022-08-17 14:06:49.564830] 2022-08-17T14:06:51.3396496Z ============================= test session starts ============================== 2022-08-17T14:06:51.3397545Z platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python 2022-08-17T14:06:51.3469202Z cachedir: .pytest_cache 2022-08-17T14:06:51.3470471Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-08-17T14:06:51.3471829Z torch: 1.13.0a0+gitce6a3c6 2022-08-17T14:06:51.3472457Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-08-17T14:06:51.3473474Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0 2022-08-17T14:06:51.4763851Z collecting ...  
2022-08-17T14:06:51.4764308Z collected 56 items  2022-08-17T14:06:51.4768462Z 2022-08-17T14:06:51.4811914Z distributed/pipeline/sync/test_pipe.py::test_pipe_without_rpc PASSED [ 1%] 2022-08-17T14:06:51.5683731Z distributed/pipeline/sync/test_pipe.py::test_parameters PASSED [ 3%] 2022-08-17T14:06:51.5838613Z distributed/pipeline/sync/test_pipe.py::test_public_attrs PASSED [ 5%] 2022-08-17T14:06:51.5995307Z distributed/pipeline/sync/test_pipe.py::test_sequential_like PASSED [ 7%] 2022-08-17T14:06:51.6256646Z distributed/pipeline/sync/test_pipe.py::test_chunks_less_than_1 PASSED [ 8%] 2022-08-17T14:06:51.6435231Z distributed/pipeline/sync/test_pipe.py::test_batch_size_indivisible PASSED [ 10%] 2022-08-17T14:06:51.6602853Z distributed/pipeline/sync/test_pipe.py::test_batch_size_small PASSED [ 12%] 2022-08-17T14:06:51.6790085Z distributed/pipeline/sync/test_pipe.py::test_checkpoint_mode PASSED [ 14%] 2022-08-17T14:06:51.6942695Z distributed/pipeline/sync/test_pipe.py::test_checkpoint_mode_invalid PASSED [ 16%] 2022-08-17T14:06:51.7105169Z distributed/pipeline/sync/test_pipe.py::test_checkpoint_mode_when_chunks_1 PASSED [ 17%] 2022-08-17T14:06:51.7378291Z distributed/pipeline/sync/test_pipe.py::test_checkpoint_eval PASSED [ 19%] 2022-08-17T14:06:51.7558695Z distributed/pipeline/sync/test_pipe.py::test_checkpoint_non_float_input PASSED [ 21%] 2022-08-17T14:06:51.7822700Z distributed/pipeline/sync/test_pipe.py::test_no_grad PASSED [ 23%] 2022-08-17T14:06:51.7979493Z distributed/pipeline/sync/test_pipe.py::test_exception PASSED [ 25%] 2022-08-17T14:06:52.0176955Z distributed/pipeline/sync/test_pipe.py::test_exception_early_stop_asap PASSED [ 26%] 2022-08-17T14:06:52.0365848Z distributed/pipeline/sync/test_pipe.py::test_nested_input PASSED [ 28%] 2022-08-17T14:06:52.0544302Z distributed/pipeline/sync/test_pipe.py::test_input_pair PASSED [ 30%] 2022-08-17T14:06:52.0810407Z distributed/pipeline/sync/test_pipe.py::test_multi_sequence_input PASSED [ 32%] 
2022-08-17T14:06:52.0981349Z distributed/pipeline/sync/test_pipe.py::test_input_singleton PASSED [ 33%] 2022-08-17T14:06:52.1138226Z distributed/pipeline/sync/test_pipe.py::test_input_varargs PASSED [ 35%] 2022-08-17T14:06:52.1294388Z distributed/pipeline/sync/test_pipe.py::test_non_tensor PASSED [ 37%] 2022-08-17T14:06:52.1458831Z distributed/pipeline/sync/test_pipe.py::test_non_tensor_sequence PASSED [ 39%] 2022-08-17T14:06:52.1691139Z distributed/pipeline/sync/test_pipe.py::test_valid_non_tensor[never] PASSED [ 41%] 2022-08-17T14:06:52.1966863Z distributed/pipeline/sync/test_pipe.py::test_valid_non_tensor[always] PASSED [ 42%] 2022-08-17T14:06:52.2236883Z distributed/pipeline/sync/test_pipe.py::test_valid_non_tensor[except_last] PASSED [ 44%] 2022-08-17T14:06:52.2396215Z distributed/pipeline/sync/test_pipe.py::test_no_tensor_output[never] PASSED [ 46%] 2022-08-17T14:06:52.2555439Z distributed/pipeline/sync/test_pipe.py::test_no_tensor_output[always] PASSED [ 48%] 2022-08-17T14:06:52.2715334Z distributed/pipeline/sync/test_pipe.py::test_no_tensor_output[except_last] PASSED [ 50%] 2022-08-17T14:06:52.2884967Z distributed/pipeline/sync/test_pipe.py::test_uneven_batch_size[never] PASSED [ 51%] 2022-08-17T14:06:52.3062430Z distributed/pipeline/sync/test_pipe.py::test_uneven_batch_size[always] PASSED [ 53%] 2022-08-17T14:06:52.3241977Z distributed/pipeline/sync/test_pipe.py::test_uneven_batch_size[except_last] PASSED [ 55%] 2022-08-17T14:06:52.3411235Z distributed/pipeline/sync/test_pipe.py::test_no_chunk[never] PASSED [ 57%] 2022-08-17T14:06:52.3592668Z distributed/pipeline/sync/test_pipe.py::test_no_chunk[always] PASSED [ 58%] 2022-08-17T14:06:52.3767988Z distributed/pipeline/sync/test_pipe.py::test_no_chunk[except_last] PASSED [ 60%] 2022-08-17T14:06:52.4025345Z distributed/pipeline/sync/test_pipe.py::test_deferred_batch_norm[never] PASSED [ 62%] 2022-08-17T14:06:52.4257866Z distributed/pipeline/sync/test_pipe.py::test_deferred_batch_norm[always] PASSED [ 64%] 
2022-08-17T14:06:52.4594340Z distributed/pipeline/sync/test_pipe.py::test_deferred_batch_norm[except_last] PASSED [ 66%] 2022-08-17T14:06:52.4892840Z distributed/pipeline/sync/test_pipe.py::test_deferred_batch_norm_params[never] PASSED [ 67%] 2022-08-17T14:06:52.5099902Z distributed/pipeline/sync/test_pipe.py::test_deferred_batch_norm_params[always] PASSED [ 69%] 2022-08-17T14:06:52.5360039Z distributed/pipeline/sync/test_pipe.py::test_devices PASSED [ 71%] 2022-08-17T14:06:52.5617150Z distributed/pipeline/sync/test_pipe.py::test_partitions PASSED [ 73%] 2022-08-17T14:06:53.7930112Z distributed/pipeline/sync/test_pipe.py::test_merged_partitions PASSED [ 75%] 2022-08-17T14:06:53.8188122Z distributed/pipeline/sync/test_pipe.py::test_deny_moving PASSED [ 76%] 2022-08-17T14:06:53.8336825Z distributed/pipeline/sync/test_pipe.py::test_empty_module PASSED [ 78%] 2022-08-17T14:06:53.8488643Z distributed/pipeline/sync/test_pipe.py::test_named_children PASSED [ 80%] 2022-08-17T14:06:53.8635853Z distributed/pipeline/sync/test_pipe.py::test_verify_module_non_sequential PASSED [ 82%] 2022-08-17T14:06:53.8788297Z distributed/pipeline/sync/test_pipe.py::test_verify_module_duplicate_children PASSED [ 83%] 2022-08-17T14:06:53.8942221Z distributed/pipeline/sync/test_pipe.py::test_verify_module_params_on_same_device PASSED [ 85%] 2022-08-17T14:06:55.5956170Z distributed/pipeline/sync/test_pipe.py::test_verify_nested_modules PASSED [ 87%] 2022-08-17T14:06:55.6111494Z distributed/pipeline/sync/test_pipe.py::test_verify_module_duplicate_parameters_on_same_device PASSED [ 89%] 2022-08-17T14:06:55.9301185Z distributed/pipeline/sync/test_pipe.py::test_forward_lockstep PASSED [ 91%] 2022-08-17T14:06:55.9473924Z distributed/pipeline/sync/test_pipe.py::test_multiple_inputs[never] PASSED [ 92%] 2022-08-17T14:06:55.9648184Z distributed/pipeline/sync/test_pipe.py::test_multiple_inputs[always] PASSED [ 94%] 2022-08-17T14:06:55.9819779Z 
distributed/pipeline/sync/test_pipe.py::test_multiple_inputs[except_last] PASSED [ 96%] 2022-08-17T14:06:55.9982747Z distributed/pipeline/sync/test_pipe.py::test_inputs_wrong_device PASSED [ 98%] 2022-08-17T14:06:56.0495764Z distributed/pipeline/sync/test_pipe.py::test_with_device_wrapper PASSED [100%] 2022-08-17T14:06:56.0496615Z 2022-08-17T14:06:56.0496953Z ============================== 56 passed in 4.71s ============================== 2022-08-17T14:06:56.4429612Z Running distributed/pipeline/sync/test_stream ... [2022-08-17 14:06:56.442501] 2022-08-17T14:06:56.4430266Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_stream.py', '-v'] ... [2022-08-17 14:06:56.442575] 2022-08-17T14:06:58.2807535Z ============================= test session starts ============================== 2022-08-17T14:06:58.2808163Z platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python 2022-08-17T14:06:58.2878670Z cachedir: .pytest_cache 2022-08-17T14:06:58.2879253Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-08-17T14:06:58.2879943Z torch: 1.13.0a0+gitce6a3c6 2022-08-17T14:06:58.2880301Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-08-17T14:06:58.2880810Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0 2022-08-17T14:06:58.3432534Z collecting ...  
2022-08-17T14:06:58.3432961Z collected 19 items  2022-08-17T14:06:58.3436871Z 2022-08-17T14:06:58.3467536Z distributed/pipeline/sync/test_stream.py::TestNewStream::test_new_stream_cpu PASSED [ 5%] 2022-08-17T14:06:59.5474851Z distributed/pipeline/sync/test_stream.py::TestNewStream::test_new_stream_cuda PASSED [ 10%] 2022-08-17T14:06:59.5487492Z distributed/pipeline/sync/test_stream.py::TestCurrentStream::test_current_stream_cpu PASSED [ 15%] 2022-08-17T14:06:59.5500221Z distributed/pipeline/sync/test_stream.py::TestCurrentStream::test_current_stream_cuda PASSED [ 21%] 2022-08-17T14:06:59.5512832Z distributed/pipeline/sync/test_stream.py::TestDefaultStream::test_default_stream_cpu PASSED [ 26%] 2022-08-17T14:06:59.5525671Z distributed/pipeline/sync/test_stream.py::TestDefaultStream::test_default_stream_cuda PASSED [ 31%] 2022-08-17T14:06:59.5538038Z distributed/pipeline/sync/test_stream.py::TestUseDevice::test_use_device_cpu PASSED [ 36%] 2022-08-17T14:06:59.5550931Z distributed/pipeline/sync/test_stream.py::TestUseDevice::test_use_device_cuda PASSED [ 42%] 2022-08-17T14:06:59.5563098Z distributed/pipeline/sync/test_stream.py::TestUseStream::test_use_stream_cpu PASSED [ 47%] 2022-08-17T14:06:59.5576124Z distributed/pipeline/sync/test_stream.py::TestUseStream::test_use_stream_cuda PASSED [ 52%] 2022-08-17T14:06:59.5588655Z distributed/pipeline/sync/test_stream.py::TestGetDevice::test_get_device_cpu PASSED [ 57%] 2022-08-17T14:06:59.5601294Z distributed/pipeline/sync/test_stream.py::TestGetDevice::test_get_device_cuda PASSED [ 63%] 2022-08-17T14:06:59.5805609Z distributed/pipeline/sync/test_stream.py::TestWaitStream::test_wait_stream_cpu_cpu PASSED [ 68%] 2022-08-17T14:07:00.0665695Z distributed/pipeline/sync/test_stream.py::TestWaitStream::test_wait_stream_cpu_cuda PASSED [ 73%] 2022-08-17T14:07:00.0681418Z distributed/pipeline/sync/test_stream.py::TestWaitStream::test_wait_stream_cuda_cpu PASSED [ 78%] 2022-08-17T14:07:00.5525954Z 
distributed/pipeline/sync/test_stream.py::TestWaitStream::test_wait_stream_cuda_cuda PASSED [ 84%] 2022-08-17T14:07:00.5539486Z distributed/pipeline/sync/test_stream.py::TestRecordStream::test_record_stream_cpu PASSED [ 89%] 2022-08-17T14:07:01.0391804Z distributed/pipeline/sync/test_stream.py::TestRecordStream::test_record_stream_cuda PASSED [ 94%] 2022-08-17T14:07:01.0417178Z distributed/pipeline/sync/test_stream.py::TestRecordStream::test_record_stream_shifted_view PASSED [100%] 2022-08-17T14:07:01.0418743Z 2022-08-17T14:07:01.0419097Z ============================== 19 passed in 2.76s ============================== 2022-08-17T14:07:01.5490769Z Running distributed/pipeline/sync/test_worker ... [2022-08-17 14:07:01.548631] 2022-08-17T14:07:01.5491404Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_worker.py', '-v'] ... [2022-08-17 14:07:01.548713] 2022-08-17T14:07:03.3920961Z ============================= test session starts ============================== 2022-08-17T14:07:03.3921550Z platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python 2022-08-17T14:07:03.3992229Z cachedir: .pytest_cache 2022-08-17T14:07:03.3992839Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-08-17T14:07:03.3993281Z torch: 1.13.0a0+gitce6a3c6 2022-08-17T14:07:03.3993599Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-08-17T14:07:03.3994099Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0 2022-08-17T14:07:03.4220447Z collecting ...  
2022-08-17T14:07:03.4220863Z collected 6 items  2022-08-17T14:07:03.4225062Z 2022-08-17T14:07:03.4262935Z distributed/pipeline/sync/test_worker.py::test_compute_multithreading PASSED [ 16%] 2022-08-17T14:07:03.4286989Z distributed/pipeline/sync/test_worker.py::test_compute_success PASSED [ 33%] 2022-08-17T14:07:03.4306630Z distributed/pipeline/sync/test_worker.py::test_compute_exception PASSED [ 50%] 2022-08-17T14:07:03.4523503Z distributed/pipeline/sync/test_worker.py::test_grad_mode[True] PASSED [ 66%] 2022-08-17T14:07:03.4544655Z distributed/pipeline/sync/test_worker.py::test_grad_mode[False] PASSED [ 83%] 2022-08-17T14:07:03.4571907Z distributed/pipeline/sync/test_worker.py::test_worker_per_device PASSED [100%] 2022-08-17T14:07:03.4573004Z 2022-08-17T14:07:03.4573324Z ============================== 6 passed in 0.07s =============================== 2022-08-17T14:07:03.6945371Z Running distributed/rpc/test_tensorpipe_agent ... [2022-08-17 14:07:03.694013] 2022-08-17T14:07:03.6946160Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/rpc/test_tensorpipe_agent.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-08-17 14:07:03.694090] 2022-08-17T14:07:05.2195614Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3fl0l85g 2022-08-17T14:07:05.2196701Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3fl0l85g/_remote_module_non_scriptable.py 2022-08-17T14:07:06.1278597Z 2022-08-17T14:07:06.1278892Z real 86m56.349s 2022-08-17T14:07:06.1279188Z user 159m42.095s 2022-08-17T14:07:06.1279428Z sys 99m39.642s 2022-08-17T14:07:06.1281822Z + assert_git_not_dirty 2022-08-17T14:07:06.1282389Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 != *rocm* ]] 2022-08-17T14:07:06.1282836Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 != *xla* ]] 2022-08-17T14:07:06.1283194Z ++ git status --porcelain 2022-08-17T14:07:06.8554661Z + git_status= 2022-08-17T14:07:06.8555069Z + [[ -n '' ]] 2022-08-17T14:07:06.8555475Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *cuda* ]] 2022-08-17T14:07:06.8555781Z + [[ 2 == 1 ]] 2022-08-17T14:07:06.8556002Z + [[ 2 == 1 ]] 2022-08-17T14:07:06.8624845Z Prepare all required actions 2022-08-17T14:07:06.8625276Z Getting action download info 2022-08-17T14:07:07.0815900Z ##[group]Run ./.github/actions/get-workflow-job-id 2022-08-17T14:07:07.0816196Z with: 2022-08-17T14:07:07.0816645Z github-token: *** 2022-08-17T14:07:07.0816867Z env: 2022-08-17T14:07:07.0817105Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:07.0817371Z GPU_FLAG: --gpus all 2022-08-17T14:07:07.0817599Z ##[endgroup] 2022-08-17T14:07:07.0850716Z ##[group]Run nick-fields/retry@71062288b76e2b6214ebde0e673ce0de1755740a 2022-08-17T14:07:07.0851029Z with: 2022-08-17T14:07:07.0851234Z shell: bash 2022-08-17T14:07:07.0851479Z timeout_minutes: 10 2022-08-17T14:07:07.0851726Z max_attempts: 5 2022-08-17T14:07:07.0851972Z retry_wait_seconds: 30 2022-08-17T14:07:07.0852505Z command: set -eux python3 -m pip install requests==2.26.0 GHA_WORKFLOW_JOB_ID=$(python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}") echo "::set-output name=job-id::${GHA_WORKFLOW_JOB_ID}" 
2022-08-17T14:07:07.0853144Z polling_interval_seconds: 1 2022-08-17T14:07:07.0853399Z warning_on_retry: true 2022-08-17T14:07:07.0853659Z continue_on_error: false 2022-08-17T14:07:07.0853903Z env: 2022-08-17T14:07:07.0854123Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:07.0854387Z GPU_FLAG: --gpus all 2022-08-17T14:07:07.0854784Z GITHUB_TOKEN: *** 2022-08-17T14:07:07.0855033Z ##[endgroup] 2022-08-17T14:07:07.1289862Z 2022-08-17T14:07:07.1364777Z + python3 -m pip install requests==2.26.0 2022-08-17T14:07:07.4321600Z Defaulting to user installation because normal site-packages is not writeable 2022-08-17T14:07:07.5681441Z Collecting requests==2.26.0 2022-08-17T14:07:07.5873564Z Downloading requests-2.26.0-py2.py3-none-any.whl (62 kB) 2022-08-17T14:07:07.6873027Z Collecting charset-normalizer~=2.0.0; python_version >= "3" 2022-08-17T14:07:07.6918410Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2022-08-17T14:07:07.7944391Z Collecting urllib3<1.27,>=1.21.1 2022-08-17T14:07:07.7989882Z Downloading urllib3-1.26.11-py2.py3-none-any.whl (139 kB) 2022-08-17T14:07:07.8964106Z Collecting certifi>=2017.4.17 2022-08-17T14:07:07.9009346Z Downloading certifi-2022.6.15-py3-none-any.whl (160 kB) 2022-08-17T14:07:07.9446092Z Collecting idna<4,>=2.5; python_version >= "3" 2022-08-17T14:07:07.9492165Z Downloading idna-3.3-py3-none-any.whl (61 kB) 2022-08-17T14:07:08.0380116Z Installing collected packages: charset-normalizer, urllib3, certifi, idna, requests 2022-08-17T14:07:08.0678925Z WARNING: The script normalizer is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-08-17T14:07:08.0679573Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 
2022-08-17T14:07:08.3123409Z Successfully installed certifi-2022.6.15 charset-normalizer-2.0.12 idna-3.3 requests-2.26.0 urllib3-1.26.11 2022-08-17T14:07:08.3609783Z ++ python3 .github/scripts/get_workflow_job_id.py 2875102080 i-02fdd1ace63d4e018 2022-08-17T14:07:10.0643533Z + GHA_WORKFLOW_JOB_ID=7878561046 2022-08-17T14:07:10.0644829Z + echo '::set-output name=job-id::7878561046' 2022-08-17T14:07:10.1375463Z Command completed after 1 attempt(s). 2022-08-17T14:07:10.1375924Z 2022-08-17T14:07:10.1502821Z ##[group]Run kill "$MONITOR_SCRIPT_PID" 2022-08-17T14:07:10.1503147Z kill "$MONITOR_SCRIPT_PID" 2022-08-17T14:07:10.1516778Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T14:07:10.1517077Z env: 2022-08-17T14:07:10.1517314Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:10.1517564Z GPU_FLAG: --gpus all 2022-08-17T14:07:10.1517825Z MONITOR_SCRIPT_PID: 56573 2022-08-17T14:07:10.1518079Z ##[endgroup] 2022-08-17T14:07:10.1610474Z Prepare all required actions 2022-08-17T14:07:10.1610844Z Getting action download info 2022-08-17T14:07:10.3559281Z Download action repository 'actions/upload-artifact@v2' (SHA:82c141cc518b40d92cc801eee768e7aafc9c2fa2) 2022-08-17T14:07:10.5175528Z ##[group]Run ./.github/actions/upload-test-artifacts 2022-08-17T14:07:10.5175826Z with: 2022-08-17T14:07:10.5176181Z file-suffix: test-distributed-2-2-linux.8xlarge.nvidia.gpu_7878561046 2022-08-17T14:07:10.5176508Z env: 2022-08-17T14:07:10.5176749Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:10.5177019Z GPU_FLAG: --gpus all 2022-08-17T14:07:10.5177249Z ##[endgroup] 2022-08-17T14:07:10.5209080Z ##[group]Run # Remove any previous test jsons if they exist 2022-08-17T14:07:10.5209459Z # Remove any previous test jsons if they exist 2022-08-17T14:07:10.5209776Z rm -f test-jsons-*.zip 2022-08-17T14:07:10.5210092Z zip -r "test-jsons-${FILE_SUFFIX}.zip" test -i '*.json' 2022-08-17T14:07:10.5222460Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T14:07:10.5222756Z env: 
2022-08-17T14:07:10.5222982Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:10.5223851Z GPU_FLAG: --gpus all 2022-08-17T14:07:10.5224357Z FILE_SUFFIX: test-distributed-2-2-linux.8xlarge.nvidia.gpu_7878561046 2022-08-17T14:07:10.5224842Z ##[endgroup] 2022-08-17T14:07:10.5396645Z adding: test/allowlist_for_publicAPI.json (deflated 80%) 2022-08-17T14:07:10.5431989Z adding: test/benchmark_utils/callgrind_artifacts.json (deflated 92%) 2022-08-17T14:07:10.5439213Z adding: test/profiler_utils_mock_events.json (deflated 87%) 2022-08-17T14:07:10.5440506Z adding: test/.pytorch-slow-tests.json (deflated 75%) 2022-08-17T14:07:10.5446383Z adding: test/.pytorch-disabled-tests.json (deflated 85%) 2022-08-17T14:07:10.5471222Z ##[group]Run # Remove any previous test reports if they exist 2022-08-17T14:07:10.5471613Z # Remove any previous test reports if they exist 2022-08-17T14:07:10.5471948Z rm -f test-reports-*.zip 2022-08-17T14:07:10.5472278Z zip -r "test-reports-${FILE_SUFFIX}.zip" test -i '*.xml' 2022-08-17T14:07:10.5484093Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T14:07:10.5484392Z env: 2022-08-17T14:07:10.5484642Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:10.5484892Z GPU_FLAG: --gpus all 2022-08-17T14:07:10.5485262Z FILE_SUFFIX: test-distributed-2-2-linux.8xlarge.nvidia.gpu_7878561046 2022-08-17T14:07:10.5485615Z ##[endgroup] 2022-08-17T14:07:10.5605449Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding/TEST-TestShardedEmbedding-20220817124013.xml (deflated 60%) 2022-08-17T14:07:10.5606878Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDdpComparisonTest-20220817124023.xml (deflated 41%) 2022-08-17T14:07:10.5608220Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20220817124029.xml (deflated 41%) 2022-08-17T14:07:10.5609384Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20220817124038.xml (deflated 40%) 2022-08-17T14:07:10.5610748Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20220817124046.xml (deflated 41%) 2022-08-17T14:07:10.5611951Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220817124055.xml (deflated 40%) 2022-08-17T14:07:10.5613167Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220817124102.xml (deflated 41%) 2022-08-17T14:07:10.5614303Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220817124110.xml (deflated 41%) 2022-08-17T14:07:10.5615722Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220817124117.xml (deflated 41%) 2022-08-17T14:07:10.5616982Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRpcTest-20220817124125.xml (deflated 40%) 2022-08-17T14:07:10.5618118Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124135.xml (deflated 40%) 2022-08-17T14:07:10.5619362Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124141.xml (deflated 40%) 2022-08-17T14:07:10.5620806Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124147.xml (deflated 40%) 2022-08-17T14:07:10.5622188Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124154.xml (deflated 40%) 2022-08-17T14:07:10.5624019Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124200.xml (deflated 40%) 2022-08-17T14:07:10.5625327Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124206.xml (deflated 40%) 2022-08-17T14:07:10.5626850Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124212.xml (deflated 40%) 2022-08-17T14:07:10.5628386Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220817124218.xml (deflated 40%) 2022-08-17T14:07:10.5629885Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124225.xml (deflated 42%) 2022-08-17T14:07:10.5631509Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124236.xml (deflated 42%) 2022-08-17T14:07:10.5633102Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124249.xml (deflated 42%) 2022-08-17T14:07:10.5634694Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124302.xml (deflated 43%) 2022-08-17T14:07:10.5636314Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124314.xml (deflated 43%) 2022-08-17T14:07:10.5637870Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124325.xml (deflated 43%) 2022-08-17T14:07:10.5639458Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124337.xml 
(deflated 43%) 2022-08-17T14:07:10.5640994Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124348.xml (deflated 43%) 2022-08-17T14:07:10.5642517Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124400.xml (deflated 43%) 2022-08-17T14:07:10.5644033Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124411.xml (deflated 43%) 2022-08-17T14:07:10.5645559Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124417.xml (deflated 43%) 2022-08-17T14:07:10.5647128Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124424.xml (deflated 43%) 2022-08-17T14:07:10.5648759Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124430.xml (deflated 43%) 2022-08-17T14:07:10.5650349Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124436.xml (deflated 43%) 2022-08-17T14:07:10.5651946Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124444.xml (deflated 43%) 2022-08-17T14:07:10.5653663Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124452.xml (deflated 42%) 2022-08-17T14:07:10.5655129Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124504.xml (deflated 43%) 2022-08-17T14:07:10.5656671Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124508.xml (deflated 43%) 2022-08-17T14:07:10.5658296Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124528.xml (deflated 43%) 2022-08-17T14:07:10.5660017Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124544.xml (deflated 42%) 2022-08-17T14:07:10.5661649Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124557.xml (deflated 42%) 2022-08-17T14:07:10.5663549Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124604.xml (deflated 42%) 2022-08-17T14:07:10.5665099Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124614.xml (deflated 42%) 2022-08-17T14:07:10.5666700Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124623.xml (deflated 42%) 2022-08-17T14:07:10.5668196Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124633.xml (deflated 43%) 2022-08-17T14:07:10.5669659Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124637.xml (deflated 42%) 2022-08-17T14:07:10.5671081Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124648.xml (deflated 42%) 2022-08-17T14:07:10.5672589Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124700.xml (deflated 42%) 2022-08-17T14:07:10.5674128Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124712.xml (deflated 42%) 2022-08-17T14:07:10.5675669Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124724.xml (deflated 42%) 2022-08-17T14:07:10.5677204Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124736.xml (deflated 42%) 2022-08-17T14:07:10.5678778Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124748.xml (deflated 42%) 2022-08-17T14:07:10.5680259Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124759.xml (deflated 42%) 2022-08-17T14:07:10.5681776Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124811.xml (deflated 42%) 2022-08-17T14:07:10.5683336Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124823.xml (deflated 42%) 2022-08-17T14:07:10.5684935Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124835.xml (deflated 42%) 2022-08-17T14:07:10.5686536Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124847.xml (deflated 42%) 2022-08-17T14:07:10.5688078Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124859.xml (deflated 42%) 2022-08-17T14:07:10.5689701Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124911.xml (deflated 42%) 2022-08-17T14:07:10.5691286Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124922.xml (deflated 42%) 2022-08-17T14:07:10.5692838Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124934.xml (deflated 42%) 2022-08-17T14:07:10.5694567Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124946.xml (deflated 42%) 2022-08-17T14:07:10.5696083Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817124956.xml (deflated 43%) 2022-08-17T14:07:10.5697678Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125008.xml (deflated 42%) 2022-08-17T14:07:10.5699231Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125017.xml (deflated 42%) 2022-08-17T14:07:10.5700854Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125027.xml (deflated 42%) 2022-08-17T14:07:10.5702484Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125039.xml (deflated 42%) 2022-08-17T14:07:10.5704442Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125050.xml (deflated 43%) 2022-08-17T14:07:10.5706036Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125057.xml (deflated 43%) 2022-08-17T14:07:10.5707635Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125104.xml (deflated 43%) 2022-08-17T14:07:10.5709246Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125110.xml (deflated 42%) 2022-08-17T14:07:10.5710795Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125116.xml (deflated 42%) 2022-08-17T14:07:10.5712358Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125125.xml (deflated 42%) 2022-08-17T14:07:10.5713884Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125133.xml (deflated 42%) 2022-08-17T14:07:10.5715499Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125142.xml (deflated 42%) 2022-08-17T14:07:10.5717063Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125150.xml (deflated 42%) 2022-08-17T14:07:10.5718694Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125159.xml (deflated 42%) 2022-08-17T14:07:10.5720182Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125207.xml (deflated 42%) 2022-08-17T14:07:10.5721686Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125216.xml (deflated 42%) 2022-08-17T14:07:10.5723232Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125228.xml (deflated 42%) 2022-08-17T14:07:10.5724824Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125239.xml (deflated 43%) 2022-08-17T14:07:10.5726400Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125246.xml (deflated 43%) 2022-08-17T14:07:10.5728061Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125258.xml (deflated 43%) 2022-08-17T14:07:10.5729667Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125304.xml (deflated 43%) 2022-08-17T14:07:10.5731285Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125310.xml (deflated 42%) 2022-08-17T14:07:10.5732942Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125317.xml (deflated 42%) 2022-08-17T14:07:10.5734641Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125326.xml (deflated 42%) 2022-08-17T14:07:10.5736321Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125333.xml (deflated 42%) 2022-08-17T14:07:10.5737964Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125340.xml (deflated 42%) 2022-08-17T14:07:10.5739599Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125352.xml (deflated 42%) 2022-08-17T14:07:10.5741295Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125406.xml (deflated 42%) 2022-08-17T14:07:10.5742971Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125420.xml (deflated 41%) 2022-08-17T14:07:10.5745063Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125432.xml (deflated 42%) 2022-08-17T14:07:10.5746691Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125452.xml (deflated 42%) 2022-08-17T14:07:10.5748388Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125514.xml (deflated 42%) 2022-08-17T14:07:10.5750026Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125518.xml (deflated 42%) 2022-08-17T14:07:10.5751702Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125540.xml (deflated 42%) 2022-08-17T14:07:10.5753317Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125559.xml (deflated 42%) 2022-08-17T14:07:10.5754941Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125618.xml (deflated 42%) 2022-08-17T14:07:10.5756476Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125636.xml (deflated 42%) 2022-08-17T14:07:10.5757888Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125655.xml (deflated 42%) 2022-08-17T14:07:10.5759060Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125713.xml (deflated 42%) 2022-08-17T14:07:10.5760007Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125733.xml (deflated 41%) 2022-08-17T14:07:10.5761341Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125755.xml (deflated 42%) 2022-08-17T14:07:10.5762818Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125814.xml (deflated 41%) 2022-08-17T14:07:10.5763775Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125836.xml (deflated 42%) 2022-08-17T14:07:10.5764961Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220817125846.xml (deflated 43%) 2022-08-17T14:07:10.5766154Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20220817125857.xml (deflated 44%) 2022-08-17T14:07:10.5767144Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20220817125904.xml (deflated 44%) 2022-08-17T14:07:10.5768134Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20220817125910.xml (deflated 43%) 2022-08-17T14:07:10.5769217Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125918.xml (deflated 38%) 2022-08-17T14:07:10.5769870Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125926.xml (deflated 38%) 2022-08-17T14:07:10.5770539Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125933.xml (deflated 39%) 2022-08-17T14:07:10.5771418Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125939.xml (deflated 38%) 2022-08-17T14:07:10.5772234Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125946.xml (deflated 37%) 2022-08-17T14:07:10.5773013Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125951.xml (deflated 39%) 2022-08-17T14:07:10.5773700Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817125956.xml (deflated 39%) 2022-08-17T14:07:10.5774354Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130002.xml (deflated 39%) 2022-08-17T14:07:10.5775332Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130007.xml (deflated 38%) 2022-08-17T14:07:10.5776478Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130014.xml (deflated 
37%) 2022-08-17T14:07:10.5777495Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130021.xml (deflated 37%) 2022-08-17T14:07:10.5778515Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130028.xml (deflated 38%) 2022-08-17T14:07:10.5779311Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130036.xml (deflated 38%) 2022-08-17T14:07:10.5780318Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130043.xml (deflated 39%) 2022-08-17T14:07:10.5781109Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130048.xml (deflated 38%) 2022-08-17T14:07:10.5781990Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-CommTest-20220817130055.xml (deflated 37%) 2022-08-17T14:07:10.5783121Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130102.xml (deflated 41%) 2022-08-17T14:07:10.5784820Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130110.xml (deflated 41%) 2022-08-17T14:07:10.5786021Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130119.xml (deflated 41%) 2022-08-17T14:07:10.5786896Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130127.xml (deflated 41%) 2022-08-17T14:07:10.5787888Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130136.xml (deflated 41%) 2022-08-17T14:07:10.5789129Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130143.xml (deflated 42%) 2022-08-17T14:07:10.5790385Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130151.xml (deflated 41%) 2022-08-17T14:07:10.5791672Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130159.xml (deflated 41%) 2022-08-17T14:07:10.5793016Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130207.xml (deflated 42%) 2022-08-17T14:07:10.5794344Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130215.xml (deflated 45%) 2022-08-17T14:07:10.5795611Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130222.xml (deflated 45%) 2022-08-17T14:07:10.5796711Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130230.xml (deflated 43%) 2022-08-17T14:07:10.5797669Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130237.xml (deflated 43%) 2022-08-17T14:07:10.5798759Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130245.xml (deflated 45%) 2022-08-17T14:07:10.5800105Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130252.xml (deflated 45%) 2022-08-17T14:07:10.5801402Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130300.xml (deflated 47%) 2022-08-17T14:07:10.5802737Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130307.xml (deflated 47%) 2022-08-17T14:07:10.5803782Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130314.xml (deflated 44%) 2022-08-17T14:07:10.5804690Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130322.xml (deflated 45%) 2022-08-17T14:07:10.5805906Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130329.xml (deflated 45%) 2022-08-17T14:07:10.5807159Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130336.xml (deflated 43%) 2022-08-17T14:07:10.5808209Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130344.xml (deflated 43%) 2022-08-17T14:07:10.5809515Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130351.xml (deflated 42%) 2022-08-17T14:07:10.5810675Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130359.xml (deflated 41%) 2022-08-17T14:07:10.5811879Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130407.xml (deflated 42%) 2022-08-17T14:07:10.5813192Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130415.xml (deflated 44%) 2022-08-17T14:07:10.5814294Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130423.xml (deflated 44%) 2022-08-17T14:07:10.5815540Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130431.xml (deflated 42%) 2022-08-17T14:07:10.5816718Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130436.xml (deflated 41%) 2022-08-17T14:07:10.5817918Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130444.xml (deflated 41%) 2022-08-17T14:07:10.5818962Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130450.xml (deflated 41%) 2022-08-17T14:07:10.5820141Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130458.xml (deflated 41%) 2022-08-17T14:07:10.5821313Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130506.xml (deflated 42%) 2022-08-17T14:07:10.5822266Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130514.xml (deflated 42%) 2022-08-17T14:07:10.5823641Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130517.xml (deflated 42%) 2022-08-17T14:07:10.5825056Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130521.xml (deflated 42%) 2022-08-17T14:07:10.5826124Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130524.xml (deflated 41%) 2022-08-17T14:07:10.5827468Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130528.xml (deflated 42%) 2022-08-17T14:07:10.5828642Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130531.xml (deflated 41%) 2022-08-17T14:07:10.5829928Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130535.xml (deflated 41%) 2022-08-17T14:07:10.5831188Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130543.xml (deflated 41%) 2022-08-17T14:07:10.5832527Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130551.xml (deflated 42%) 2022-08-17T14:07:10.5833812Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130559.xml (deflated 41%) 2022-08-17T14:07:10.5835295Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130607.xml (deflated 41%) 2022-08-17T14:07:10.5836578Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130617.xml (deflated 42%) 2022-08-17T14:07:10.5837933Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130622.xml (deflated 41%) 2022-08-17T14:07:10.5839228Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130627.xml (deflated 42%) 2022-08-17T14:07:10.5840229Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130636.xml (deflated 41%) 2022-08-17T14:07:10.5841020Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130644.xml (deflated 41%) 2022-08-17T14:07:10.5841818Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130653.xml (deflated 41%) 2022-08-17T14:07:10.5842595Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130701.xml (deflated 42%) 2022-08-17T14:07:10.5843482Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130706.xml (deflated 42%) 2022-08-17T14:07:10.5845133Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130712.xml (deflated 41%) 2022-08-17T14:07:10.5846247Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130718.xml (deflated 42%) 2022-08-17T14:07:10.5847018Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130724.xml (deflated 42%) 2022-08-17T14:07:10.5847815Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130732.xml (deflated 42%) 2022-08-17T14:07:10.5848601Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130741.xml (deflated 43%) 2022-08-17T14:07:10.5849388Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130804.xml (deflated 44%) 2022-08-17T14:07:10.5850163Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130813.xml (deflated 42%) 2022-08-17T14:07:10.5850945Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130821.xml (deflated 41%) 2022-08-17T14:07:10.5851723Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130826.xml (deflated 41%) 2022-08-17T14:07:10.5852515Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130834.xml (deflated 40%) 2022-08-17T14:07:10.5853301Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130842.xml (deflated 41%) 2022-08-17T14:07:10.5854070Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-DistributedDataParallelTest-20220817130851.xml (deflated 41%) 2022-08-17T14:07:10.5854836Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130900.xml (deflated 41%) 2022-08-17T14:07:10.5855595Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130905.xml (deflated 41%) 2022-08-17T14:07:10.5856346Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130911.xml (deflated 42%) 2022-08-17T14:07:10.5857078Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130913.xml (deflated 41%) 2022-08-17T14:07:10.5857899Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130918.xml (deflated 41%) 2022-08-17T14:07:10.5858663Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130923.xml (deflated 40%) 2022-08-17T14:07:10.5859413Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130929.xml (deflated 41%) 2022-08-17T14:07:10.5860144Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130934.xml (deflated 42%) 2022-08-17T14:07:10.5860888Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-NcclErrorHandlingTest-20220817130936.xml (deflated 41%) 2022-08-17T14:07:10.5861657Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLNoGPUTest-20220817130941.xml (deflated 41%) 2022-08-17T14:07:10.5862427Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817130943.xml (deflated 38%) 2022-08-17T14:07:10.5863159Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817130950.xml (deflated 38%) 2022-08-17T14:07:10.5864233Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817130958.xml (deflated 38%) 2022-08-17T14:07:10.5864984Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131006.xml (deflated 39%) 2022-08-17T14:07:10.5865729Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131014.xml (deflated 38%) 2022-08-17T14:07:10.5866458Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131022.xml (deflated 39%) 2022-08-17T14:07:10.5867200Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131030.xml (deflated 39%) 2022-08-17T14:07:10.5867951Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131038.xml (deflated 38%) 2022-08-17T14:07:10.5868691Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131044.xml (deflated 38%) 2022-08-17T14:07:10.5869422Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131052.xml (deflated 39%) 2022-08-17T14:07:10.5870155Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131104.xml (deflated 39%) 2022-08-17T14:07:10.5870893Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131112.xml (deflated 39%) 2022-08-17T14:07:10.5871630Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131119.xml (deflated 39%) 2022-08-17T14:07:10.5872354Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131127.xml (deflated 39%) 2022-08-17T14:07:10.5873091Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131135.xml (deflated 38%) 2022-08-17T14:07:10.5873865Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131141.xml (deflated 39%) 2022-08-17T14:07:10.5874606Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131149.xml (deflated 38%) 2022-08-17T14:07:10.5875324Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-ProcessGroupNCCLTest-20220817131201.xml (deflated 38%) 2022-08-17T14:07:10.5876054Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-RendezvousEnvTest-20220817131208.xml (deflated 40%) 2022-08-17T14:07:10.5876842Z adding: test/test-reports/python-unittest/distributed.test_c10d_nccl/TEST-TimeoutTest-20220817131212.xml (deflated 40%) 2022-08-17T14:07:10.5877533Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131220.xml (deflated 39%) 2022-08-17T14:07:10.5878179Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131225.xml (deflated 38%) 2022-08-17T14:07:10.5878847Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131232.xml (deflated 38%) 2022-08-17T14:07:10.5879508Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131238.xml (deflated 38%) 2022-08-17T14:07:10.5880163Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131244.xml (deflated 38%) 2022-08-17T14:07:10.5880811Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131251.xml (deflated 39%) 2022-08-17T14:07:10.5881471Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131257.xml (deflated 38%) 2022-08-17T14:07:10.5882164Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20220817131302.xml (deflated 38%) 2022-08-17T14:07:10.5882903Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131307.xml (deflated 45%) 2022-08-17T14:07:10.5883777Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131315.xml (deflated 45%) 2022-08-17T14:07:10.5884574Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131322.xml (deflated 43%) 2022-08-17T14:07:10.5885372Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131330.xml (deflated 43%) 2022-08-17T14:07:10.5886170Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131337.xml (deflated 45%) 2022-08-17T14:07:10.5886950Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131345.xml (deflated 45%) 2022-08-17T14:07:10.5887734Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131352.xml (deflated 47%) 2022-08-17T14:07:10.5888528Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131359.xml (deflated 47%) 2022-08-17T14:07:10.5889319Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131407.xml (deflated 44%) 2022-08-17T14:07:10.5890089Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131414.xml (deflated 45%) 2022-08-17T14:07:10.5890868Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131421.xml (deflated 45%) 2022-08-17T14:07:10.5891663Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131429.xml (deflated 43%) 2022-08-17T14:07:10.5892455Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131436.xml (deflated 43%) 2022-08-17T14:07:10.5893232Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131444.xml (deflated 43%) 2022-08-17T14:07:10.5894013Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131449.xml (deflated 44%) 2022-08-17T14:07:10.5894801Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131456.xml (deflated 45%) 2022-08-17T14:07:10.5895584Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131502.xml (deflated 44%) 2022-08-17T14:07:10.5896407Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131507.xml (deflated 45%) 2022-08-17T14:07:10.5897199Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131512.xml (deflated 45%) 2022-08-17T14:07:10.5897992Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131518.xml (deflated 50%) 2022-08-17T14:07:10.5898777Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131525.xml (deflated 41%) 2022-08-17T14:07:10.5899543Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131532.xml (deflated 41%) 2022-08-17T14:07:10.5900329Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131539.xml (deflated 41%) 2022-08-17T14:07:10.5901112Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131545.xml (deflated 41%) 2022-08-17T14:07:10.5901896Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131553.xml (deflated 42%) 2022-08-17T14:07:10.5902733Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131600.xml (deflated 42%) 2022-08-17T14:07:10.5904038Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131605.xml (deflated 42%) 2022-08-17T14:07:10.5905271Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131611.xml (deflated 41%) 2022-08-17T14:07:10.5906060Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131616.xml (deflated 41%) 2022-08-17T14:07:10.5906829Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131622.xml (deflated 44%) 2022-08-17T14:07:10.5907615Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131627.xml (deflated 45%) 2022-08-17T14:07:10.5908398Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131633.xml (deflated 41%) 2022-08-17T14:07:10.5909179Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131638.xml (deflated 41%) 2022-08-17T14:07:10.5909948Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131646.xml (deflated 41%) 2022-08-17T14:07:10.5910731Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131651.xml (deflated 41%) 2022-08-17T14:07:10.5911511Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131657.xml (deflated 41%) 2022-08-17T14:07:10.5912298Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20220817131705.xml (deflated 41%) 2022-08-17T14:07:10.5913050Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131712.xml (deflated 39%) 2022-08-17T14:07:10.5913806Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131718.xml (deflated 39%) 2022-08-17T14:07:10.5914552Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131725.xml (deflated 39%) 2022-08-17T14:07:10.5915297Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131731.xml (deflated 39%) 2022-08-17T14:07:10.5916041Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131737.xml (deflated 39%) 2022-08-17T14:07:10.5916849Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131742.xml (deflated 39%) 2022-08-17T14:07:10.5917614Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131748.xml (deflated 39%) 2022-08-17T14:07:10.5918366Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131754.xml (deflated 40%) 2022-08-17T14:07:10.5919105Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131803.xml (deflated 40%) 2022-08-17T14:07:10.5919830Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131809.xml (deflated 39%) 2022-08-17T14:07:10.5920566Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131816.xml (deflated 39%) 2022-08-17T14:07:10.5921312Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131823.xml (deflated 39%) 2022-08-17T14:07:10.5922048Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131829.xml (deflated 40%) 2022-08-17T14:07:10.5922841Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131834.xml (deflated 39%) 2022-08-17T14:07:10.5923574Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131840.xml (deflated 39%) 2022-08-17T14:07:10.5924314Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131845.xml (deflated 39%) 2022-08-17T14:07:10.5925051Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131851.xml (deflated 40%) 2022-08-17T14:07:10.5925775Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131858.xml (deflated 40%) 2022-08-17T14:07:10.5926510Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131904.xml (deflated 40%) 2022-08-17T14:07:10.5927250Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131910.xml (deflated 40%) 2022-08-17T14:07:10.5927988Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131917.xml (deflated 40%) 2022-08-17T14:07:10.5928707Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131923.xml (deflated 40%) 2022-08-17T14:07:10.5929439Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131928.xml (deflated 39%) 2022-08-17T14:07:10.5930177Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131936.xml (deflated 39%) 2022-08-17T14:07:10.5930917Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131941.xml (deflated 40%) 2022-08-17T14:07:10.5931632Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131947.xml (deflated 39%) 2022-08-17T14:07:10.5932366Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817131954.xml (deflated 39%) 2022-08-17T14:07:10.5933098Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132000.xml (deflated 39%) 2022-08-17T14:07:10.5933835Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132006.xml (deflated 39%) 2022-08-17T14:07:10.5934554Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132013.xml (deflated 39%) 2022-08-17T14:07:10.5935352Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132019.xml (deflated 39%) 2022-08-17T14:07:10.5936102Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132024.xml (deflated 39%) 2022-08-17T14:07:10.5936842Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132030.xml (deflated 40%) 2022-08-17T14:07:10.5937560Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132040.xml (deflated 39%) 2022-08-17T14:07:10.5938289Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132045.xml (deflated 39%) 2022-08-17T14:07:10.5939024Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132051.xml (deflated 39%) 2022-08-17T14:07:10.5939754Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132058.xml (deflated 39%) 2022-08-17T14:07:10.5940476Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132103.xml (deflated 40%) 2022-08-17T14:07:10.5941213Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132109.xml (deflated 39%) 2022-08-17T14:07:10.5942026Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132117.xml (deflated 39%) 2022-08-17T14:07:10.5942761Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132123.xml (deflated 40%) 2022-08-17T14:07:10.5943702Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132129.xml (deflated 39%) 2022-08-17T14:07:10.5944454Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132135.xml (deflated 39%) 2022-08-17T14:07:10.5945201Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132142.xml (deflated 39%) 2022-08-17T14:07:10.5945934Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132147.xml (deflated 39%) 2022-08-17T14:07:10.5946660Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132154.xml (deflated 40%) 2022-08-17T14:07:10.5947395Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132156.xml (deflated 39%) 2022-08-17T14:07:10.5948132Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132201.xml (deflated 41%) 2022-08-17T14:07:10.5948868Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132203.xml (deflated 40%) 2022-08-17T14:07:10.5949592Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20220817132210.xml (deflated 39%) 2022-08-17T14:07:10.5950298Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20220817132216.xml (deflated 39%) 2022-08-17T14:07:10.5950984Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20220817132218.xml (deflated 39%) 2022-08-17T14:07:10.5951669Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20220817132220.xml (deflated 39%) 2022-08-17T14:07:10.5952333Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20220817132221.xml (deflated 39%) 2022-08-17T14:07:10.5953006Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20220817132223.xml (deflated 38%) 2022-08-17T14:07:10.5953683Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20220817132225.xml (deflated 39%) 2022-08-17T14:07:10.5954460Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-RendezvousEnvTest-20220817132227.xml (deflated 39%) 2022-08-17T14:07:10.5955152Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-TimeoutTest-20220817132230.xml (deflated 41%) 2022-08-17T14:07:10.5955850Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestHooks-20220817132234.xml (deflated 80%) 2022-08-17T14:07:10.5956547Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestNoGrad-20220817132234.xml (deflated 64%) 2022-08-17T14:07:10.5957265Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParamInit-20220817132234.xml (deflated 61%) 2022-08-17T14:07:10.5957984Z adding: 
test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParityWithDDP-20220817132234.xml (deflated 91%) 2022-08-17T14:07:10.5958866Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_mixed_precision/TEST-TestFSDPMixedPrecisionSharded-20220817132807.xml (deflated 92%) 2022-08-17T14:07:10.5959761Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_mixed_precision/TEST-TestFSDPMixedPrecisionUnsharded-20220817132807.xml (deflated 56%) 2022-08-17T14:07:10.5960615Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params/TEST-TestSummonFullParams-20220817133241.xml (deflated 93%) 2022-08-17T14:07:10.5961555Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_summon_full_params/TEST-TestSummonFullParamsNoShard-20220817133241.xml (deflated 84%) 2022-08-17T14:07:10.5962353Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_state_dict/TEST-TestFSDPStateDict-20220817133702.xml (deflated 94%) 2022-08-17T14:07:10.5963248Z adding: test/test-reports/python-unittest/distributed.optim.test_zero_redundancy_optimizer/TEST-TestZeroRedundancyOptimizerDistributed-20220817134055.xml (deflated 90%) 2022-08-17T14:07:10.5964237Z adding: test/test-reports/python-unittest/distributed.optim.test_zero_redundancy_optimizer/TEST-TestZeroRedundancyOptimizerSingleRank-20220817134055.xml (deflated 73%) 2022-08-17T14:07:10.5965103Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_optim_state/TEST-TestFSDPOptimState-20220817134336.xml (deflated 91%) 2022-08-17T14:07:10.5965937Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestCreateTensorFromParams-20220817134603.xml (deflated 43%) 2022-08-17T14:07:10.5966817Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorMetadata-20220817134603.xml (deflated 43%) 2022-08-17T14:07:10.5967652Z adding: 
test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestLocalTensor-20220817134603.xml (deflated 60%) 2022-08-17T14:07:10.5968457Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestModuleHookApi-20220817134603.xml (deflated 58%) 2022-08-17T14:07:10.5969273Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardMetadata-20220817134603.xml (deflated 58%) 2022-08-17T14:07:10.5970065Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardParameter-20220817134603.xml (deflated 60%) 2022-08-17T14:07:10.5970875Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardTensor-20220817134603.xml (deflated 60%) 2022-08-17T14:07:10.5971704Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorChunked-20220817134603.xml (deflated 88%) 2022-08-17T14:07:10.5972585Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorCustomOps-20220817134603.xml (deflated 69%) 2022-08-17T14:07:10.5973497Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorEnumerable-20220817134603.xml (deflated 87%) 2022-08-17T14:07:10.5974447Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorFromLocalShards-20220817134603.xml (deflated 82%) 2022-08-17T14:07:10.5975382Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorFromLocalTensor-20220817134603.xml (deflated 61%) 2022-08-17T14:07:10.5976226Z adding: test/test-reports/python-unittest/distributed.test_c10d_pypg/TEST-TestDDPWithWorkSubclass-20220817134812.xml (deflated 84%) 
2022-08-17T14:07:10.5976980Z adding: test/test-reports/python-unittest/distributed.test_c10d_pypg/TEST-TestDDPWithWorkWrapper-20220817134812.xml (deflated 84%) 2022-08-17T14:07:10.5977722Z adding: test/test-reports/python-unittest/distributed.fsdp.test_wrap/TEST-TestAutoWrap-20220817135019.xml (deflated 81%) 2022-08-17T14:07:10.5978435Z adding: test/test-reports/python-unittest/distributed.fsdp.test_wrap/TEST-TestFSDPWrap-20220817135019.xml (deflated 89%) 2022-08-17T14:07:10.5979185Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_clip_grad_norm/TEST-TestCalcuGradNorm-20220817135150.xml (deflated 84%) 2022-08-17T14:07:10.5980013Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_clip_grad_norm/TEST-TestClipGradNorm-20220817135150.xml (deflated 86%) 2022-08-17T14:07:10.5980816Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135454.xml (deflated 40%) 2022-08-17T14:07:10.5981634Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135500.xml (deflated 41%) 2022-08-17T14:07:10.5982446Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135506.xml (deflated 40%) 2022-08-17T14:07:10.5983544Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135511.xml (deflated 40%) 2022-08-17T14:07:10.5984844Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135517.xml (deflated 40%) 2022-08-17T14:07:10.5985672Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135523.xml (deflated 41%) 2022-08-17T14:07:10.5986479Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135528.xml (deflated 41%) 2022-08-17T14:07:10.5987286Z adding: 
test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135534.xml (deflated 40%) 2022-08-17T14:07:10.5988071Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20220817135539.xml (deflated 40%) 2022-08-17T14:07:10.5988885Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20220817135545.xml (deflated 39%) 2022-08-17T14:07:10.5989688Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20220817135550.xml (deflated 39%) 2022-08-17T14:07:10.5990493Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20220817135557.xml (deflated 39%) 2022-08-17T14:07:10.5991277Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20220817135604.xml (deflated 39%) 2022-08-17T14:07:10.5992090Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20220817135612.xml (deflated 39%) 2022-08-17T14:07:10.5992851Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_misc/TEST-TestFSDPMisc-20220817135621.xml (deflated 77%) 2022-08-17T14:07:10.5993710Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_comm_hooks/TEST-TestCommunicationHooks-20220817135718.xml (deflated 87%) 2022-08-17T14:07:10.5994543Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135814.xml (deflated 42%) 2022-08-17T14:07:10.5995389Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135822.xml (deflated 41%) 2022-08-17T14:07:10.5996233Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135830.xml (deflated 42%) 
2022-08-17T14:07:10.5997062Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135838.xml (deflated 42%) 2022-08-17T14:07:10.5997877Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135847.xml (deflated 42%) 2022-08-17T14:07:10.5998711Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135855.xml (deflated 42%) 2022-08-17T14:07:10.5999547Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135903.xml (deflated 42%) 2022-08-17T14:07:10.6000469Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135911.xml (deflated 42%) 2022-08-17T14:07:10.6001285Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220817135919.xml (deflated 41%) 2022-08-17T14:07:10.6002106Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_freezing_weights/TEST-TestFreezingWeights-20220817135926.xml (deflated 84%) 2022-08-17T14:07:10.6002881Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_comm/TEST-TestCommunication-20220817140009.xml (deflated 91%) 2022-08-17T14:07:10.6003652Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_exec_order/TEST-TestFSDPExecOrder-20220817140046.xml (deflated 82%) 2022-08-17T14:07:10.6004518Z adding: test/test-reports/python-unittest/distributed.algorithms.ddp_comm_hooks.test_ddp_hooks/TEST-DistributedDataParallelCommHookTest-20220817140121.xml (deflated 79%) 2022-08-17T14:07:10.6005393Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_meta/TEST-TestFSDPWithMetaDevice-20220817140153.xml (deflated 86%) 2022-08-17T14:07:10.6006205Z adding: 
test/test-reports/python-unittest/distributed.fsdp.test_fsdp_ignored_modules/TEST-TestFSDPIgnoredModules-20220817140223.xml (deflated 75%) 2022-08-17T14:07:10.6007084Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint/TEST-TestDistributedReshardOnLoad-20220817140246.xml (deflated 68%) 2022-08-17T14:07:10.6008024Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint/TEST-TestDistributedStateDictSaveLoad-20220817140246.xml (deflated 43%) 2022-08-17T14:07:10.6009041Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint/TEST-TestDistributedStateDictSaveLoadWithSharedTensor-20220817140246.xml (deflated 44%) 2022-08-17T14:07:10.6010022Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_checkpoint/TEST-TestDistributedCheckpointing-20220817140308.xml (deflated 55%) 2022-08-17T14:07:10.6010885Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_checkpoint/TEST-TestDistributedFailure-20220817140308.xml (deflated 75%) 2022-08-17T14:07:10.6011697Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_checkpoint/TEST-TestStorageKeys-20220817140308.xml (deflated 40%) 2022-08-17T14:07:10.6012475Z adding: test/test-reports/python-unittest/distributed.test_c10d_object_collectives/TEST-TestObjectCollectives-20220817140329.xml (deflated 69%) 2022-08-17T14:07:10.6013423Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint_cpu/TEST-TestDistributedReshardOnLoad-20220817140346.xml (deflated 68%) 2022-08-17T14:07:10.6014387Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint_cpu/TEST-TestDistributedStateDictSaveLoad-20220817140346.xml (deflated 42%) 2022-08-17T14:07:10.6015539Z adding: 
test/test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint_cpu/TEST-TestDistributedStateDictSaveLoadWithSharedTensor-20220817140346.xml (deflated 44%) 2022-08-17T14:07:10.6017346Z adding: test/test-reports/python-unittest/distributed._shard.sharding_plan.test_sharding_plan/TEST-TestShardingPlan-20220817140400.xml (deflated 75%) 2022-08-17T14:07:10.6019157Z adding: test/test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorOps-20220817140413.xml (deflated 67%) 2022-08-17T14:07:10.6020772Z adding: test/test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorReshard-20220817140413.xml (deflated 61%) 2022-08-17T14:07:10.6022481Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_binary_cmp/TEST-TestShardedTensorBinaryOps-20220817140427.xml (deflated 74%) 2022-08-17T14:07:10.6024616Z adding: test/test-reports/python-unittest/distributed.fsdp.test_distributed_checkpoint/TEST-TestDistributedCheckpoint-20220817140438.xml (deflated 59%) 2022-08-17T14:07:10.6026640Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_init/TEST-TestShardedTensorNNInit-20220817140449.xml (deflated 69%) 2022-08-17T14:07:10.6027675Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor_reshard/TEST-TestReshard-20220817140458.xml (deflated 61%) 2022-08-17T14:07:10.6028488Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_multiple_forward/TEST-TestMultiForward-20220817140506.xml (deflated 41%) 2022-08-17T14:07:10.6029266Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16/TEST-TestPureFP16-20220817140513.xml (deflated 56%) 2022-08-17T14:07:10.6030067Z adding: test/test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-LocalTimerServerTest-20220817140521.xml (deflated 70%) 2022-08-17T14:07:10.6030893Z adding: 
test/test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-LocalTimerTest-20220817140521.xml (deflated 69%) 2022-08-17T14:07:10.6031749Z adding: test/test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-MultiprocessingRequestQueueTest-20220817140521.xml (deflated 66%) 2022-08-17T14:07:10.6032599Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_uneven/TEST-TestUnevenParamShard-20220817140528.xml (deflated 41%) 2022-08-17T14:07:10.6033363Z adding: test/test-reports/python-unittest/distributed.test_data_parallel/TEST-TestDataParallel-20220817140537.xml (deflated 83%) 2022-08-17T14:07:10.6034180Z adding: test/test-reports/python-unittest/distributed.test_data_parallel/TEST-TestDataParallelDeviceTypeCUDA-20220817140537.xml (deflated 90%) 2022-08-17T14:07:10.6035016Z adding: test/test-reports/python-unittest/distributed.elastic.utils.distributed_test/TEST-DistributedUtilTest-20220817140544.xml (deflated 71%) 2022-08-17T14:07:10.6035937Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_megatron_prototype/TEST-TestShardedTensorMegatronLinear-20220817140551.xml (deflated 43%) 2022-08-17T14:07:10.6036778Z adding: test/test-reports/python-unittest/distributed.elastic.utils.util_test/TEST-StoreUtilTest-20220817140556.xml (deflated 62%) 2022-08-17T14:07:10.6037521Z adding: test/test-reports/python-unittest/distributed.elastic.utils.util_test/TEST-UtilTest-20220817140556.xml (deflated 69%) 2022-08-17T14:07:10.6038294Z adding: test/test-reports/python-unittest/distributed.fsdp.test_checkpoint_wrapper/TEST-CheckpointWrapperTest-20220817140600.xml (deflated 64%) 2022-08-17T14:07:10.6039233Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_utils/TEST-TestMedatadaIndex-20220817140604.xml (deflated 72%) 2022-08-17T14:07:10.6040044Z adding: test/test-reports/python-unittest/distributed.elastic.utils.logging_test/TEST-LoggingTest-20220817140607.xml (deflated 
54%) 2022-08-17T14:07:10.6040819Z adding: test/test-reports/python-unittest/distributed.test_launcher/TEST-TestDistributedLaunch-20220817140611.xml (deflated 42%) 2022-08-17T14:07:10.6041650Z adding: test/test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135301.xml (deflated 43%) 2022-08-17T14:07:10.6042542Z adding: test/test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135303.xml (deflated 44%) 2022-08-17T14:07:10.6043430Z adding: test/test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135305.xml (deflated 42%) 2022-08-17T14:07:10.6044319Z adding: test/test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135312.xml (deflated 42%) 2022-08-17T14:07:10.6045183Z adding: test/test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135320.xml (deflated 42%) 2022-08-17T14:07:10.6046150Z adding: test/test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135328.xml (deflated 42%) 2022-08-17T14:07:10.6047027Z adding: test/test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135338.xml (deflated 43%) 2022-08-17T14:07:10.6047908Z adding: test/test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135340.xml (deflated 43%) 2022-08-17T14:07:10.6048791Z adding: test/test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135342.xml (deflated 41%) 2022-08-17T14:07:10.6049653Z adding: test/test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135350.xml (deflated 42%) 2022-08-17T14:07:10.6050531Z adding: 
test/test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135358.xml (deflated 41%) 2022-08-17T14:07:10.6051412Z adding: test/test-reports/dist-nccl/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135405.xml (deflated 42%) 2022-08-17T14:07:10.6052301Z adding: test/test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135415.xml (deflated 41%) 2022-08-17T14:07:10.6053163Z adding: test/test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135420.xml (deflated 41%) 2022-08-17T14:07:10.6054044Z adding: test/test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135426.xml (deflated 43%) 2022-08-17T14:07:10.6054922Z adding: test/test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135428.xml (deflated 44%) 2022-08-17T14:07:10.6055803Z adding: test/test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135429.xml (deflated 44%) 2022-08-17T14:07:10.6056662Z adding: test/test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135431.xml (deflated 44%) 2022-08-17T14:07:10.6057537Z adding: test/test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135435.xml (deflated 41%) 2022-08-17T14:07:10.6058455Z adding: test/test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135440.xml (deflated 41%) 2022-08-17T14:07:10.6059341Z adding: test/test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135445.xml (deflated 44%) 2022-08-17T14:07:10.6060193Z adding: 
test/test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135447.xml (deflated 44%) 2022-08-17T14:07:10.6061070Z adding: test/test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135449.xml (deflated 44%) 2022-08-17T14:07:10.6061941Z adding: test/test-reports/dist-gloo/distributed.algorithms.quantization.test_quantization/TEST-DistQuantizationTests-20220817135451.xml (deflated 44%) 2022-08-17T14:07:10.6097134Z ##[group]Run # Remove any previous test reports if they exist 2022-08-17T14:07:10.6097526Z # Remove any previous test reports if they exist 2022-08-17T14:07:10.6097841Z rm -f usage-log-*.zip 2022-08-17T14:07:10.6098204Z # this workflow is also run in bazel build test, but we dont generate usage reports for it 2022-08-17T14:07:10.6098597Z # so check to see if the file exists first 2022-08-17T14:07:10.6098903Z if [ -f 'usage_log.txt' ]; then 2022-08-17T14:07:10.6099330Z  zip "usage-log-${FILE_SUFFIX}.zip" 'usage_log.txt' 2022-08-17T14:07:10.6099604Z fi 2022-08-17T14:07:10.6111768Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T14:07:10.6112062Z env: 2022-08-17T14:07:10.6112302Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:10.6112552Z GPU_FLAG: --gpus all 2022-08-17T14:07:10.6112922Z FILE_SUFFIX: test-distributed-2-2-linux.8xlarge.nvidia.gpu_7878561046 2022-08-17T14:07:10.6113274Z ##[endgroup] 2022-08-17T14:07:10.6880837Z adding: usage_log.txt (deflated 94%) 2022-08-17T14:07:10.6926773Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-08-17T14:07:10.6927068Z with: 2022-08-17T14:07:10.6927349Z s3-prefix: pytorch/pytorch/2875102080/1/artifact 2022-08-17T14:07:10.6927629Z retention-days: 14 2022-08-17T14:07:10.6927897Z if-no-files-found: warn 2022-08-17T14:07:10.6928170Z path: test-jsons-*.zip 2022-08-17T14:07:10.6928420Z name: artifact 2022-08-17T14:07:10.6928653Z s3-bucket: gha-artifacts 2022-08-17T14:07:10.6928923Z region: 
us-east-1 2022-08-17T14:07:10.6929153Z env: 2022-08-17T14:07:10.6929372Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:10.6929635Z GPU_FLAG: --gpus all 2022-08-17T14:07:10.6929877Z ##[endgroup] 2022-08-17T14:07:11.0894965Z NOTE: s3-prefix specified, ignoring name parameter 2022-08-17T14:07:11.0895348Z With the provided path, there will be 1 file uploaded 2022-08-17T14:07:11.0895722Z Uploading to s3 prefix: pytorch/pytorch/2875102080/1/artifact 2022-08-17T14:07:11.0906422Z Starting upload of test-jsons-test-distributed-2-2-linux.8xlarge.nvidia.gpu_7878561046.zip 2022-08-17T14:07:11.2439984Z Finished upload of test-jsons-test-distributed-2-2-linux.8xlarge.nvidia.gpu_7878561046.zip 2022-08-17T14:07:11.2574580Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-08-17T14:07:11.2574875Z with: 2022-08-17T14:07:11.2575157Z s3-prefix: pytorch/pytorch/2875102080/1/artifact 2022-08-17T14:07:11.2575443Z retention-days: 14 2022-08-17T14:07:11.2575716Z if-no-files-found: error 2022-08-17T14:07:11.2576009Z path: test-reports-*.zip 2022-08-17T14:07:11.2576253Z name: artifact 2022-08-17T14:07:11.2576506Z s3-bucket: gha-artifacts 2022-08-17T14:07:11.2576772Z region: us-east-1 2022-08-17T14:07:11.2577003Z env: 2022-08-17T14:07:11.2577227Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:11.2577492Z GPU_FLAG: --gpus all 2022-08-17T14:07:11.2577744Z ##[endgroup] 2022-08-17T14:07:11.6544035Z NOTE: s3-prefix specified, ignoring name parameter 2022-08-17T14:07:11.6544436Z With the provided path, there will be 1 file uploaded 2022-08-17T14:07:11.6544797Z Uploading to s3 prefix: pytorch/pytorch/2875102080/1/artifact 2022-08-17T14:07:11.6555360Z Starting upload of test-reports-test-distributed-2-2-linux.8xlarge.nvidia.gpu_7878561046.zip 2022-08-17T14:07:11.8497497Z Finished upload of test-reports-test-distributed-2-2-linux.8xlarge.nvidia.gpu_7878561046.zip 2022-08-17T14:07:11.8655166Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-08-17T14:07:11.8655464Z with: 2022-08-17T14:07:11.8655760Z 
s3-prefix: pytorch/pytorch/2875102080/1/artifact 2022-08-17T14:07:11.8656040Z retention-days: 14 2022-08-17T14:07:11.8656309Z if-no-files-found: ignore 2022-08-17T14:07:11.8656585Z path: usage-log-*.zip 2022-08-17T14:07:11.8656819Z name: artifact 2022-08-17T14:07:11.8657073Z s3-bucket: gha-artifacts 2022-08-17T14:07:11.8657336Z region: us-east-1 2022-08-17T14:07:11.8657569Z env: 2022-08-17T14:07:11.8657791Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:11.8658054Z GPU_FLAG: --gpus all 2022-08-17T14:07:11.8658301Z ##[endgroup] 2022-08-17T14:07:12.2601022Z NOTE: s3-prefix specified, ignoring name parameter 2022-08-17T14:07:12.2601434Z With the provided path, there will be 1 file uploaded 2022-08-17T14:07:12.2601831Z Uploading to s3 prefix: pytorch/pytorch/2875102080/1/artifact 2022-08-17T14:07:12.2611460Z Starting upload of usage-log-test-distributed-2-2-linux.8xlarge.nvidia.gpu_7878561046.zip 2022-08-17T14:07:12.4669953Z Finished upload of usage-log-test-distributed-2-2-linux.8xlarge.nvidia.gpu_7878561046.zip 2022-08-17T14:07:12.4808674Z ##[group]Run set -x 2022-08-17T14:07:12.4808965Z set -x 2022-08-17T14:07:12.4809269Z python3 -m pip install -r requirements.txt 2022-08-17T14:07:12.4809612Z python3 -m pip install boto3==1.19.12 2022-08-17T14:07:12.4810009Z python3 -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test 2022-08-17T14:07:12.4824036Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T14:07:12.4824340Z env: 2022-08-17T14:07:12.4824583Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:12.4824833Z GPU_FLAG: --gpus all 2022-08-17T14:07:12.4825104Z AWS_DEFAULT_REGION: us-east-1 2022-08-17T14:07:12.4825370Z BRANCH: pull/82657 2022-08-17T14:07:12.4825614Z TEST_CONFIG: distributed 2022-08-17T14:07:12.4825869Z SHARD_NUMBER: 2 2022-08-17T14:07:12.4826184Z BUILD_ENVIRONMENT: linux-bionic-cuda11.6-py3.10-gcc7 2022-08-17T14:07:12.4826483Z PR_NUMBER: 82657 2022-08-17T14:07:12.4826753Z PYTORCH_RETRY_TEST_CASES: 1 
2022-08-17T14:07:12.4827083Z PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1 2022-08-17T14:07:12.4827385Z SHA1: ce6a3c605df99d1df57c0dda75c06d748e54ed2a 2022-08-17T14:07:12.4827661Z TAG: 2022-08-17T14:07:12.4827887Z WORKFLOW_ID: 2875102080 2022-08-17T14:07:12.4828317Z GITHUB_TOKEN: *** 2022-08-17T14:07:12.4828567Z GHA_WORKFLOW_JOB_ID: 7878561046 2022-08-17T14:07:12.4828826Z ##[endgroup] 2022-08-17T14:07:12.4858666Z + python3 -m pip install -r requirements.txt 2022-08-17T14:07:12.7758674Z Defaulting to user installation because normal site-packages is not writeable 2022-08-17T14:07:12.8554359Z Collecting astunparse 2022-08-17T14:07:12.8730667Z Downloading astunparse-1.6.3-py2.py3-none-any.whl (12 kB) 2022-08-17T14:07:12.9059343Z Collecting expecttest 2022-08-17T14:07:12.9105856Z Downloading expecttest-0.1.3-py3-none-any.whl (6.5 kB) 2022-08-17T14:07:12.9510331Z Collecting future 2022-08-17T14:07:12.9559569Z Downloading future-0.18.2.tar.gz (829 kB) 2022-08-17T14:07:14.2702543Z Collecting numpy 2022-08-17T14:07:14.2768289Z Downloading numpy-1.21.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB) 2022-08-17T14:07:14.6282191Z Requirement already satisfied: psutil in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 6)) (5.9.1) 2022-08-17T14:07:14.7501770Z Collecting pyyaml 2022-08-17T14:07:14.7611095Z Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB) 2022-08-17T14:07:14.7866046Z Requirement already satisfied: requests in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 8)) (2.26.0) 2022-08-17T14:07:14.8039503Z Requirement already satisfied: setuptools in /usr/lib/python3.7/site-packages (from -r requirements.txt (line 9)) (49.1.3) 2022-08-17T14:07:14.8663064Z Collecting six 2022-08-17T14:07:14.8708873Z Downloading six-1.16.0-py2.py3-none-any.whl (11 kB) 2022-08-17T14:07:14.9049577Z Collecting types-dataclasses 
2022-08-17T14:07:14.9094993Z Downloading types_dataclasses-0.6.6-py3-none-any.whl (2.9 kB) 2022-08-17T14:07:14.9526789Z Collecting typing_extensions 2022-08-17T14:07:14.9572953Z Downloading typing_extensions-4.3.0-py3-none-any.whl (25 kB) 2022-08-17T14:07:15.0130040Z Collecting sympy 2022-08-17T14:07:15.0191107Z Downloading sympy-1.10.1-py3-none-any.whl (6.4 MB) 2022-08-17T14:07:15.2520259Z Collecting wheel<1.0,>=0.23.0 2022-08-17T14:07:15.2563922Z Downloading wheel-0.37.1-py2.py3-none-any.whl (35 kB) 2022-08-17T14:07:15.2690744Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 8)) (1.26.11) 2022-08-17T14:07:15.2909194Z Requirement already satisfied: certifi>=2017.4.17 in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 8)) (2022.6.15) 2022-08-17T14:07:15.2921169Z Requirement already satisfied: charset-normalizer~=2.0.0; python_version >= "3" in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 8)) (2.0.12) 2022-08-17T14:07:15.2949177Z Requirement already satisfied: idna<4,>=2.5; python_version >= "3" in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 8)) (3.3) 2022-08-17T14:07:15.3222630Z Collecting mpmath>=0.19 2022-08-17T14:07:15.3297266Z Downloading mpmath-1.2.1-py3-none-any.whl (532 kB) 2022-08-17T14:07:15.3573164Z Using legacy 'setup.py install' for future, since package 'wheel' is not installed. 2022-08-17T14:07:15.4072488Z Installing collected packages: six, wheel, astunparse, expecttest, future, numpy, pyyaml, types-dataclasses, typing-extensions, mpmath, sympy 2022-08-17T14:07:15.4492222Z WARNING: The script wheel is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-08-17T14:07:15.4492850Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 
2022-08-17T14:07:15.4815645Z Running setup.py install for future: started 2022-08-17T14:07:16.1639243Z Running setup.py install for future: finished with status 'done' 2022-08-17T14:07:18.1991589Z WARNING: The scripts f2py, f2py3 and f2py3.7 are installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-08-17T14:07:18.1992263Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-08-17T14:07:27.2308961Z WARNING: The script isympy is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-08-17T14:07:27.2309618Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-08-17T14:07:27.2639543Z Successfully installed astunparse-1.6.3 expecttest-0.1.3 future-0.18.2 mpmath-1.2.1 numpy-1.21.6 pyyaml-6.0 six-1.16.0 sympy-1.10.1 types-dataclasses-0.6.6 typing-extensions-4.3.0 wheel-0.37.1 2022-08-17T14:07:27.3356522Z + python3 -m pip install boto3==1.19.12 2022-08-17T14:07:27.6248474Z Defaulting to user installation because normal site-packages is not writeable 2022-08-17T14:07:28.5596056Z Collecting boto3==1.19.12 2022-08-17T14:07:28.5813145Z Downloading boto3-1.19.12-py3-none-any.whl (131 kB) 2022-08-17T14:07:29.6844888Z Collecting botocore<1.23.0,>=1.22.12 2022-08-17T14:07:29.6906938Z Downloading botocore-1.22.12-py3-none-any.whl (8.1 MB) 2022-08-17T14:07:29.9356014Z Collecting jmespath<1.0.0,>=0.7.1 2022-08-17T14:07:29.9400998Z Downloading jmespath-0.10.0-py2.py3-none-any.whl (24 kB) 2022-08-17T14:07:29.9902704Z Collecting s3transfer<0.6.0,>=0.5.0 2022-08-17T14:07:29.9944959Z Downloading s3transfer-0.5.2-py3-none-any.whl (79 kB) 2022-08-17T14:07:30.0101674Z Requirement already satisfied: urllib3<1.27,>=1.25.4 in /home/ec2-user/.local/lib/python3.7/site-packages (from botocore<1.23.0,>=1.22.12->boto3==1.19.12) (1.26.11) 2022-08-17T14:07:30.0795422Z Collecting python-dateutil<3.0.0,>=2.1 2022-08-17T14:07:30.0848033Z 
Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB) 2022-08-17T14:07:30.1035029Z Requirement already satisfied: six>=1.5 in /home/ec2-user/.local/lib/python3.7/site-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.23.0,>=1.22.12->boto3==1.19.12) (1.16.0) 2022-08-17T14:07:30.1966935Z Installing collected packages: jmespath, python-dateutil, botocore, s3transfer, boto3 2022-08-17T14:07:31.1107674Z Successfully installed boto3-1.19.12 botocore-1.22.12 jmespath-0.10.0 python-dateutil-2.8.2 s3transfer-0.5.2 2022-08-17T14:07:31.1620856Z + python3 -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test 2022-08-17T14:07:41.3824391Z [scribe] Scribe access token not provided, sending report via boto3... 2022-08-17T14:07:41.3824662Z 2022-08-17T14:07:41.3829221Z ----- Historic stats comparison result ------ 2022-08-17T14:07:41.3829478Z 2022-08-17T14:07:41.3829747Z job: linux-bionic-cuda11.6-py3.10-gcc7 2022-08-17T14:07:41.3830123Z commit: ce6a3c605df99d1df57c0dda75c06d748e54ed2a 2022-08-17T14:07:41.3830338Z 2022-08-17T14:07:41.3830966Z Commit graph (base is most recent master ancestor with at least one S3 report): 2022-08-17T14:07:41.3831355Z 2022-08-17T14:07:41.3831443Z : (master) 2022-08-17T14:07:41.3831673Z | 2022-08-17T14:07:41.3834933Z | * ce6a3c605d (HEAD) total time 4247.69s 2022-08-17T14:07:41.3835214Z | | 2022-08-17T14:07:41.3835454Z | : (3 commits) 2022-08-17T14:07:41.3835690Z |/| 2022-08-17T14:07:41.3835893Z | : (6 commits) 2022-08-17T14:07:41.3836115Z | 2022-08-17T14:07:41.3836890Z * 343b5f8651 (base) 7 reports, total time 3509.45s ± 1268.64s 2022-08-17T14:07:41.3837337Z * 1a09b05c94 7 reports, total time 3542.76s ± 1272.47s 2022-08-17T14:07:41.3837778Z * df62ea76d1 7 reports, total time 3610.62s ± 1354.32s 2022-08-17T14:07:41.3838213Z * aac622ad55 7 reports, total time 3610.05s ± 1375.49s 2022-08-17T14:07:41.3838630Z * 31d4b6f52a 7 reports, total time 3607.72s ± 1303.31s 2022-08-17T14:07:41.3839062Z * 0e2efaf9cc 7 reports, total 
time 3618.15s ± 1377.87s 2022-08-17T14:07:41.3839559Z * 1ee9eb52b6 7 reports, total time 3593.26s ± 1339.91s 2022-08-17T14:07:41.3839880Z * f4b7c10e14 0 reports 2022-08-17T14:07:41.3840257Z * 785f7f6298 7 reports, total time 3597.37s ± 1338.01s 2022-08-17T14:07:41.3840675Z * 3586af8adc 7 reports, total time 3614.99s ± 1362.93s 2022-08-17T14:07:41.3840955Z | 2022-08-17T14:07:41.3841150Z : 2022-08-17T14:07:41.3841290Z 2022-08-17T14:07:41.3841464Z Removed (across 776 suites) 0 tests, totaling 0.00s 2022-08-17T14:07:41.3841818Z Modified (across 0 suites) 0 tests, totaling 0.00s 2022-08-17T14:07:41.3842154Z Added (across 89 suites) 1051 tests, totaling +4247.69s 2022-08-17T14:07:41.4413593Z Prepare all required actions 2022-08-17T14:07:41.4440706Z ##[group]Run ./.github/actions/teardown-linux 2022-08-17T14:07:41.4441000Z with: 2022-08-17T14:07:41.4441205Z env: 2022-08-17T14:07:41.4441453Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:41.4441745Z GPU_FLAG: --gpus all 2022-08-17T14:07:41.4441987Z ##[endgroup] 2022-08-17T14:07:41.4483355Z ##[group]Run .github/scripts/wait_for_ssh_to_drain.sh 2022-08-17T14:07:41.4483696Z .github/scripts/wait_for_ssh_to_drain.sh 2022-08-17T14:07:41.4497248Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T14:07:41.4497545Z env: 2022-08-17T14:07:41.4497769Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:41.4498040Z GPU_FLAG: --gpus all 2022-08-17T14:07:41.4498289Z ##[endgroup] 2022-08-17T14:07:41.4544080Z Holding runner for 2 hours until all ssh sessions have logged out 2022-08-17T14:07:41.4615740Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2022-08-17T14:07:41.4616203Z # ignore expansion of "docker ps -q" since it could be empty 2022-08-17T14:07:41.4616537Z # shellcheck disable=SC2046 2022-08-17T14:07:41.4616848Z docker stop $(docker ps -q) || true 2022-08-17T14:07:41.4617162Z # Prune all of the docker images 2022-08-17T14:07:41.4617447Z docker system prune -af 2022-08-17T14:07:41.4629898Z shell: 
/usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-08-17T14:07:41.4630201Z env: 2022-08-17T14:07:41.4630431Z GIT_DEFAULT_BRANCH: master 2022-08-17T14:07:41.4630702Z GPU_FLAG: --gpus all 2022-08-17T14:07:41.4630957Z ##[endgroup] 2022-08-17T14:07:43.2569845Z 20a8245f1146 2022-08-17T14:07:43.9723774Z Deleted Containers: 2022-08-17T14:07:43.9724186Z 20a8245f11468fcac7d3aeef57d0511f555b38bc9ada231a407ac49b94964c98 2022-08-17T14:07:43.9724450Z 2022-08-17T14:07:49.1967507Z Deleted Images: 2022-08-17T14:07:49.1968744Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:a347f7e7645f04fc68e4f87c73cf0385233153b8 2022-08-17T14:07:49.1969925Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7@sha256:490eaf5744f7cb16e853fb3063447df07750afa5d8ef966c98f5802528aaa7d2 2022-08-17T14:07:49.1970847Z deleted: sha256:9133c485eff1fb81f8324205e64d739258dc611669790c9c78be2dd3455adf77 2022-08-17T14:07:49.1971269Z deleted: sha256:08321466b65a03f253e5a7615d41e10b50035e920fe89e66a02c3cfc8d805f3d 2022-08-17T14:07:49.1971699Z deleted: sha256:6dce0210de025d8578f5babd7484aa8f1377934051f7db50c918b8fa8c101629 2022-08-17T14:07:49.1972102Z deleted: sha256:e09aa9749862fb755142c450d264dd994579829e912bc75ba19b7951963bfd4b 2022-08-17T14:07:49.1972522Z deleted: sha256:76bce255a88fa3ba245602b42d9487be9b35c5e5c775f10c95dc483031f2ced9 2022-08-17T14:07:49.1972950Z deleted: sha256:7a72485906e5db6ea49224c12b21807e4ba807fd4185810aa10933b449a1ee4b 2022-08-17T14:07:49.1973642Z deleted: sha256:cefbae7a195dc035182759e1db3bb92004dc3ef447806fb53e7e42f144936146 2022-08-17T14:07:49.1974088Z deleted: sha256:01c1445002474e4d86fc5c9517ca689731d2622b5b009c19cf24873e7290ca78 2022-08-17T14:07:49.1974511Z deleted: sha256:8b3b6997919dfb00f7a1d345edfbf5c8ee3de0436a1bb6718e4b4226312701cc 2022-08-17T14:07:49.1974935Z deleted: sha256:c448943ab49d88d1b852d6e91b63de8125676d94692391de0048c463dde7cd24 
2022-08-17T14:07:49.1975350Z deleted: sha256:f7d3bbb11a3361204993dabec834573017baefbf66962761acf8de745ee23f29 2022-08-17T14:07:49.1975971Z deleted: sha256:260d0cd944f3afc1f4b6df77a6dfe292eaad55d07252d24a4d44f72bdb26bfca 2022-08-17T14:07:49.1976720Z deleted: sha256:73fada4ae2af631a60d536fb6b2e2752b8ad0f7d6e9a309b521af35327837101 2022-08-17T14:07:49.1977190Z deleted: sha256:cf71b29b3b5f8d54a4eaa0a46b2cf9f73c456eb3c4ad44ada5934f922815d85b 2022-08-17T14:07:49.1977623Z deleted: sha256:50c337bf1065c7c2db817361ab1b8f822581b1cd3e10d4cf2ad4011dd35ae808 2022-08-17T14:07:49.1978063Z deleted: sha256:ccbd83235d368df1b75f92641a91ccfacf12528e36411e2c064120d296fa27a3 2022-08-17T14:07:49.1978617Z deleted: sha256:028c1e96e96aa398e84a52df91d77e16d3cb663923f4d2a64e5f6629c92c9bde 2022-08-17T14:07:49.1979040Z deleted: sha256:25ea441ce3a35050634ecb4ce556bc5f302af49fa3891506c9ef9b1a6ccf83ad 2022-08-17T14:07:49.1979474Z deleted: sha256:b1c2c4d91e87f274f2e87e402cc3c80ae42050a88f5a46e24842ced371a62c19 2022-08-17T14:07:49.1979901Z deleted: sha256:27ce2970cb31b9050b8087a7346cf70f6c30f3815e1f34c111063d2a7d4222d5 2022-08-17T14:07:49.1980314Z deleted: sha256:633332114d1aa6b05f783865baaf35d975e78537113c04776eaa571d4fa94897 2022-08-17T14:07:49.1980701Z deleted: sha256:26c681992b758022f6f29e267e338c9484b62e0094f756ecc6a62acb7006f852 2022-08-17T14:07:49.1981112Z deleted: sha256:64ae17f41f441eede6b37a1970799bc455952e6a42587af8b7e0c6f77d4e1958 2022-08-17T14:07:49.1981536Z deleted: sha256:7308b942f2c5be9968ec9923693ce31093d803ad0d9f63879df18569df44a823 2022-08-17T14:07:49.1981932Z deleted: sha256:b3b80d941a1a002821cb9d54ca54480a6f208a0673fe0d47963888ee9ade79c5 2022-08-17T14:07:49.1982347Z deleted: sha256:853ebb38438204091b96dc6d6f591b51161c280a3e50ec17039b7767dc402757 2022-08-17T14:07:49.1982741Z deleted: sha256:b6426b0e449861081555183583ff79e8294e834662d433f52bb28bb553e75dcc 2022-08-17T14:07:49.1983136Z deleted: sha256:dd250b8239c18bf51b08c752513845460328d66233ac3430ff4b98aef4ced4c7 
2022-08-17T14:07:49.1983906Z deleted: sha256:a7f4ac636e13bbb309775469e0addcf87fd260bb8f44731920831f7483f667bc 2022-08-17T14:07:49.1984378Z deleted: sha256:2f8ffa0d955ce734ecc4d94583cc8fa55972025f9c39fdb33f586ad3a71fb7bb 2022-08-17T14:07:49.1984815Z deleted: sha256:09f3b1f44b30430b1979cefed51350b0544808969bb04c8b77af43f469ef4d8a 2022-08-17T14:07:49.1985218Z deleted: sha256:a46cd87a0f3487787594ca74ade83178f1ea6169be4444f8f24417c3aa8aba83 2022-08-17T14:07:49.1985636Z deleted: sha256:725c99b07f8df059184999658da19223ae55b22b5cc826555f7400f8af00938d 2022-08-17T14:07:49.1986054Z deleted: sha256:2e9cce52f2d9bf0b67c1990b5304313c0d27166e04a6d34ef62dfe8fd48e1d02 2022-08-17T14:07:49.1986512Z deleted: sha256:8c0a79caeaf13c2f1bd84209cadde461a0ae7a83b09004a3f7d702c9274df097 2022-08-17T14:07:49.1986927Z deleted: sha256:18f05226ee66c3fbe66354682277e22a25e6d34de7bb390460c4d55558377cfb 2022-08-17T14:07:49.1987341Z deleted: sha256:22f0fcf09a4c92499f405be7463b88c52c53170f23c1dcbb3d2b0bd3630c63ad 2022-08-17T14:07:49.1987778Z deleted: sha256:d6e40ecc5aa7f4efd300de019085685810fb94cf9e93fa18121563c9fdb40757 2022-08-17T14:07:49.1988294Z deleted: sha256:93ce5053216be995b4be4da356365902839b567f8305291cf962f38e5289d5ab 2022-08-17T14:07:49.1988720Z deleted: sha256:a5bb28dbaf7948b250d6fef4626b6bbd9f4e234778a5c4875d743d1f4db13036 2022-08-17T14:07:49.1989167Z deleted: sha256:faafc73f1dc40b0eebd845398b532620a19f56b40494cc9915bc11f64ebe6126 2022-08-17T14:07:49.1989609Z deleted: sha256:9a176eb230ca0736a7f90acd1ce3dcac5c9ed759943748c26a2e049506402651 2022-08-17T14:07:49.1990039Z deleted: sha256:ecf42c8411d75ac738acf3d5f30f080c2313e0f96d0dc5cdc5dbad84b4b12218 2022-08-17T14:07:49.1990479Z deleted: sha256:0efd5eb7b29596b7c37f9c700e3223444045ffe4d46423b7d48a54ccae7c2558 2022-08-17T14:07:49.1990912Z deleted: sha256:129bdb873e79117f4e90135f0c6a58f775fcf596f4eb514b803771cef2da8278 2022-08-17T14:07:49.1991335Z deleted: sha256:2d49e3a81bd436bfd20fb4a849cdc98da82cb74afef3de38dda7a946d3fc4153 
2022-08-17T14:07:49.1991793Z deleted: sha256:0ba4e259108e5311ddf6b79ae3a35f8f16a4004ef8817e50427baa3cc90ac081 2022-08-17T14:07:49.1992233Z deleted: sha256:c164403226561914f16becdeca65c54d20dba8dad414b062efc34c05c47bf725 2022-08-17T14:07:49.1992663Z deleted: sha256:cbe4006b2e6286d50c1b292fb71b69d5299d65f055285519eafc41eac3ef8a3c 2022-08-17T14:07:49.1993083Z deleted: sha256:edcec18dceb25f1a03ec20de4676464613e69072875a83f5c45e45a31aafc5b9 2022-08-17T14:07:49.1993506Z deleted: sha256:13c4f317ac4bb48997302756b8d5f8b602e835607c9806a1a5b200e9a0657d8a 2022-08-17T14:07:49.1993918Z deleted: sha256:57f043e380f4586c76968d6e062b50bac55254a5be7e80bea3c027a5bb316469 2022-08-17T14:07:49.1994310Z deleted: sha256:3e549931e0240b9aac25dc79ed6a6259863879a5c9bd20755f77cac27c1ab8c8 2022-08-17T14:07:49.1994551Z 2022-08-17T14:07:49.2076332Z Total reclaimed space: 19.25GB 2022-08-17T14:07:49.2137548Z Post job cleanup. 2022-08-17T14:07:49.2176125Z Post job cleanup. 2022-08-17T14:07:49.3497884Z [command]/usr/bin/git version 2022-08-17T14:07:49.3547156Z git version 2.37.1 2022-08-17T14:07:49.3611101Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/89e8c67e-5056-4d05-bd1d-641f052d8c48' before making global git config changes 2022-08-17T14:07:49.3611689Z Adding repository directory to the temporary git global config as a safe directory 2022-08-17T14:07:49.3620200Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-08-17T14:07:49.3667396Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2022-08-17T14:07:49.3703763Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || : 2022-08-17T14:07:49.4034274Z Entering 'android/libs/fbjni' 2022-08-17T14:07:49.4076687Z Entering 'third_party/FP16' 2022-08-17T14:07:49.4121407Z Entering 'third_party/FXdiv' 
2022-08-17T14:07:49.4162377Z Entering 'third_party/NNPACK' 2022-08-17T14:07:49.4205693Z Entering 'third_party/QNNPACK' 2022-08-17T14:07:49.4247810Z Entering 'third_party/XNNPACK' 2022-08-17T14:07:49.4301200Z Entering 'third_party/benchmark' 2022-08-17T14:07:49.4343590Z Entering 'third_party/cpuinfo' 2022-08-17T14:07:49.4386402Z Entering 'third_party/cub' 2022-08-17T14:07:49.4429197Z Entering 'third_party/cudnn_frontend' 2022-08-17T14:07:49.4478299Z Entering 'third_party/eigen' 2022-08-17T14:07:49.4524101Z Entering 'third_party/fbgemm' 2022-08-17T14:07:49.4566096Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-08-17T14:07:49.4608198Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-08-17T14:07:49.4650879Z Entering 'third_party/fbgemm/third_party/googletest' 2022-08-17T14:07:49.4694414Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-08-17T14:07:49.4737378Z Entering 'third_party/flatbuffers' 2022-08-17T14:07:49.4782528Z Entering 'third_party/fmt' 2022-08-17T14:07:49.4825607Z Entering 'third_party/foxi' 2022-08-17T14:07:49.4867521Z Entering 'third_party/gemmlowp/gemmlowp' 2022-08-17T14:07:49.4909997Z Entering 'third_party/gloo' 2022-08-17T14:07:49.4951983Z Entering 'third_party/googletest' 2022-08-17T14:07:49.4995062Z Entering 'third_party/ideep' 2022-08-17T14:07:49.5037719Z Entering 'third_party/ideep/mkl-dnn' 2022-08-17T14:07:49.5080866Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-08-17T14:07:49.5130169Z Entering 'third_party/ios-cmake' 2022-08-17T14:07:49.5173824Z Entering 'third_party/ittapi' 2022-08-17T14:07:49.5217420Z Entering 'third_party/kineto' 2022-08-17T14:07:49.5259947Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-08-17T14:07:49.5302681Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-08-17T14:07:49.5347119Z Entering 'third_party/nccl/nccl' 2022-08-17T14:07:49.5389197Z Entering 'third_party/neon2sse' 2022-08-17T14:07:49.5432590Z Entering 'third_party/nlohmann' 
2022-08-17T14:07:49.5476012Z Entering 'third_party/onnx' 2022-08-17T14:07:49.5532317Z Entering 'third_party/onnx/third_party/benchmark' 2022-08-17T14:07:49.5575501Z Entering 'third_party/onnx/third_party/pybind11' 2022-08-17T14:07:49.5620879Z Entering 'third_party/onnx-tensorrt' 2022-08-17T14:07:49.5663760Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-08-17T14:07:49.5713066Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-08-17T14:07:49.5756703Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-08-17T14:07:49.5799221Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-08-17T14:07:49.5849921Z Entering 'third_party/pocketfft' 2022-08-17T14:07:49.5892287Z Entering 'third_party/protobuf' 2022-08-17T14:07:49.5939131Z Entering 'third_party/protobuf/third_party/benchmark' 2022-08-17T14:07:49.5981990Z Entering 'third_party/protobuf/third_party/googletest' 2022-08-17T14:07:49.6027915Z Entering 'third_party/psimd' 2022-08-17T14:07:49.6071264Z Entering 'third_party/pthreadpool' 2022-08-17T14:07:49.6115826Z Entering 'third_party/pybind11' 2022-08-17T14:07:49.6158844Z Entering 'third_party/python-enum' 2022-08-17T14:07:49.6202897Z Entering 'third_party/python-peachpy' 2022-08-17T14:07:49.6245487Z Entering 'third_party/python-six' 2022-08-17T14:07:49.6289057Z Entering 'third_party/sleef' 2022-08-17T14:07:49.6332541Z Entering 'third_party/tbb' 2022-08-17T14:07:49.6377378Z Entering 'third_party/tensorpipe' 2022-08-17T14:07:49.6420330Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-08-17T14:07:49.6462146Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-08-17T14:07:49.6504972Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-08-17T14:07:49.6549267Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-08-17T14:07:49.6590460Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-08-17T14:07:49.6635882Z Entering 
'third_party/zstd' 2022-08-17T14:07:49.6699452Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2022-08-17T14:07:49.6730233Z http.https://github.com/.extraheader 2022-08-17T14:07:49.6741150Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2022-08-17T14:07:49.6781743Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || : 2022-08-17T14:07:49.7108487Z Entering 'android/libs/fbjni' 2022-08-17T14:07:49.7133778Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7168947Z Entering 'third_party/FP16' 2022-08-17T14:07:49.7194306Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7228470Z Entering 'third_party/FXdiv' 2022-08-17T14:07:49.7254056Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7287308Z Entering 'third_party/NNPACK' 2022-08-17T14:07:49.7313197Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7347615Z Entering 'third_party/QNNPACK' 2022-08-17T14:07:49.7372612Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7405789Z Entering 'third_party/XNNPACK' 2022-08-17T14:07:49.7432574Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7478011Z Entering 'third_party/benchmark' 2022-08-17T14:07:49.7504357Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7537657Z Entering 'third_party/cpuinfo' 2022-08-17T14:07:49.7563698Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7597858Z Entering 'third_party/cub' 2022-08-17T14:07:49.7623233Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7656054Z Entering 'third_party/cudnn_frontend' 2022-08-17T14:07:49.7682266Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7722521Z Entering 'third_party/eigen' 2022-08-17T14:07:49.7748548Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7784995Z 
Entering 'third_party/fbgemm' 2022-08-17T14:07:49.7810257Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7844124Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-08-17T14:07:49.7870177Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7902901Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-08-17T14:07:49.7928834Z http.https://github.com/.extraheader 2022-08-17T14:07:49.7962168Z Entering 'third_party/fbgemm/third_party/googletest' 2022-08-17T14:07:49.7988272Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8020991Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-08-17T14:07:49.8046374Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8079929Z Entering 'third_party/flatbuffers' 2022-08-17T14:07:49.8105848Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8140982Z Entering 'third_party/fmt' 2022-08-17T14:07:49.8166294Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8199174Z Entering 'third_party/foxi' 2022-08-17T14:07:49.8224522Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8257344Z Entering 'third_party/gemmlowp/gemmlowp' 2022-08-17T14:07:49.8282991Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8316191Z Entering 'third_party/gloo' 2022-08-17T14:07:49.8341645Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8374715Z Entering 'third_party/googletest' 2022-08-17T14:07:49.8400220Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8433725Z Entering 'third_party/ideep' 2022-08-17T14:07:49.8459262Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8492128Z Entering 'third_party/ideep/mkl-dnn' 2022-08-17T14:07:49.8517250Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8552308Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-08-17T14:07:49.8577535Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8617996Z Entering 'third_party/ios-cmake' 2022-08-17T14:07:49.8643644Z http.https://github.com/.extraheader 
2022-08-17T14:07:49.8676162Z Entering 'third_party/ittapi' 2022-08-17T14:07:49.8701452Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8734853Z Entering 'third_party/kineto' 2022-08-17T14:07:49.8760576Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8794010Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-08-17T14:07:49.8818830Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8852369Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-08-17T14:07:49.8878736Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8914470Z Entering 'third_party/nccl/nccl' 2022-08-17T14:07:49.8939504Z http.https://github.com/.extraheader 2022-08-17T14:07:49.8973433Z Entering 'third_party/neon2sse' 2022-08-17T14:07:49.8998717Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9032266Z Entering 'third_party/nlohmann' 2022-08-17T14:07:49.9057083Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9091724Z Entering 'third_party/onnx' 2022-08-17T14:07:49.9117321Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9165033Z Entering 'third_party/onnx/third_party/benchmark' 2022-08-17T14:07:49.9190646Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9223380Z Entering 'third_party/onnx/third_party/pybind11' 2022-08-17T14:07:49.9249375Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9286229Z Entering 'third_party/onnx-tensorrt' 2022-08-17T14:07:49.9311634Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9344294Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-08-17T14:07:49.9369461Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9408286Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-08-17T14:07:49.9433381Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9469049Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-08-17T14:07:49.9493747Z http.https://github.com/.extraheader 
2022-08-17T14:07:49.9527452Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-08-17T14:07:49.9553614Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9592987Z Entering 'third_party/pocketfft' 2022-08-17T14:07:49.9618000Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9651178Z Entering 'third_party/protobuf' 2022-08-17T14:07:49.9676557Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9713759Z Entering 'third_party/protobuf/third_party/benchmark' 2022-08-17T14:07:49.9738225Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9771350Z Entering 'third_party/protobuf/third_party/googletest' 2022-08-17T14:07:49.9797974Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9833806Z Entering 'third_party/psimd' 2022-08-17T14:07:49.9858843Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9891931Z Entering 'third_party/pthreadpool' 2022-08-17T14:07:49.9917080Z http.https://github.com/.extraheader 2022-08-17T14:07:49.9950069Z Entering 'third_party/pybind11' 2022-08-17T14:07:49.9976030Z http.https://github.com/.extraheader 2022-08-17T14:07:50.0011291Z Entering 'third_party/python-enum' 2022-08-17T14:07:50.0037333Z http.https://github.com/.extraheader 2022-08-17T14:07:50.0070103Z Entering 'third_party/python-peachpy' 2022-08-17T14:07:50.0095360Z http.https://github.com/.extraheader 2022-08-17T14:07:50.0127759Z Entering 'third_party/python-six' 2022-08-17T14:07:50.0153473Z http.https://github.com/.extraheader 2022-08-17T14:07:50.0188154Z Entering 'third_party/sleef' 2022-08-17T14:07:50.0212520Z http.https://github.com/.extraheader 2022-08-17T14:07:50.0245456Z Entering 'third_party/tbb' 2022-08-17T14:07:50.0270725Z http.https://github.com/.extraheader 2022-08-17T14:07:50.0305642Z Entering 'third_party/tensorpipe' 2022-08-17T14:07:50.0330618Z http.https://github.com/.extraheader 2022-08-17T14:07:50.0364440Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-08-17T14:07:50.0389399Z 
http.https://github.com/.extraheader 2022-08-17T14:07:50.0421368Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-08-17T14:07:50.0446500Z http.https://github.com/.extraheader 2022-08-17T14:07:50.0479152Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-08-17T14:07:50.0503947Z http.https://github.com/.extraheader 2022-08-17T14:07:50.0537250Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-08-17T14:07:50.0562793Z http.https://github.com/.extraheader 2022-08-17T14:07:50.0595217Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-08-17T14:07:50.0619554Z http.https://github.com/.extraheader 2022-08-17T14:07:50.0655882Z Entering 'third_party/zstd' 2022-08-17T14:07:50.0681814Z http.https://github.com/.extraheader 2022-08-17T14:07:50.0988627Z Cleaning up orphan processes