diff --git a/docs/api.md b/docs/api.md
index c5e77ff94dadc497e18c096b65d991d685f1fef7..1e57aaaa4e3219413cf3829d931fe53e14dc8237 100644
--- a/docs/api.md
+++ b/docs/api.md
@@ -18,7 +18,7 @@ Complete all of the setup as mentioned in [the Setup doc](setup.md).
 - Reshard the FSDP checkpoints using the script `metaseq/scripts/reshard_fsdp.py`. For example, we can merge all FSDP shards within each of the 8 model parallel parts of OPT-175B using the following command:
 ```bash
 for j in {0..7}; do
-    python -m metaseq.scripts.reshard_fsdp
+    python -m metaseq.scripts.reshard_fsdp \
     --input-glob-pattern "/path/to/raw/checkpoints/checkpoint_last-model_part-$j-shard*.pt" \
     --output-shard-name "/path/to/resharded/checkpoints/reshard-model_part-$j.pt" \
     --num-output-shards 1 --skip-optimizer-state True --unflatten-weights True
diff --git a/projects/OPT/download_opt175b.md b/projects/OPT/download_opt175b.md
index b369016b40a1cf527c026e6d04dbb5c56dba4d5f..1ed3f272ab19753e6b65f5b4691bcdd003940f06 100644
--- a/projects/OPT/download_opt175b.md
+++ b/projects/OPT/download_opt175b.md
@@ -35,7 +35,7 @@ md5sum *
 To consolidate the 992 shards into 8 files model-parallel evaluation, run the `metaseq.scripts.reshard_fsdp` script:
 ```bash
 for j in {0..7}; do
-    python -m metaseq.scripts.reshard_fsdp
+    python -m metaseq.scripts.reshard_fsdp \
     --input-glob-pattern "/path/to/raw/checkpoints/checkpoint_last-model_part-$j-shard*.pt" \
     --output-shard-name "/path/to/resharded/checkpoints/reshard-model_part-$j.pt" \
     --num-output-shards 1 --skip-optimizer-state True --unflatten-weights True