Commits · github/fork/xhluca/modified-constants · Administrator / metaseq

03 Jul, 2022 1 commit
- Merge branch 'facebookresearch:main' into modified-constants · d9d76965
  Xing Han Lu authored 3 years ago
  
  d9d76965
30 Jun, 2022 1 commit

[api] Remove beam search (#187) · 42884515

Stephen Roller authored 3 years ago

* Greedy is implemented.

* Add early stopping, sort beams by probability

* Pass on temperature from UI. Early stopping.

* Kill some more dead code.

* Typo

* Fix lint.

42884515

27 Jun, 2022 3 commits

Update api.md to mention Alpa integration (#188) · 00cb4be9
devonwp authored 3 years ago

00cb4be9
Merge branch 'facebookresearch:main' into modified-constants · c569d364
Xing Han Lu authored 3 years ago

c569d364

Add Aim logging to progress_bar (#92) · 4c1e004a

Albert Torosyan authored 3 years ago


* Add Aim logging to progress_bar

* Add Aim logging arguments to validate() method progress_bar

* Add aim package to install requirements

* [fix] Add imports from base_progress_bar

* Add Aim usage mini-guide

* Add len method to AimProgressBarWrapper

* Fix linter issues

Co-authored-by: Gor Arakelyan <arakelyangor10@gmail.com>

4c1e004a

24 Jun, 2022 2 commits
- Revert "[api] Fix singular positive logit (#167)" (#182) · 7e834563
  Stephen Roller authored 3 years ago
```
This reverts commit 1aa40c2a.
```
  7e834563
- [Community] Add OPT-Alpa integration (#179) · 16d4efab
  Hao Zhang authored 3 years ago
  
  16d4efab
23 Jun, 2022 4 commits
- Merge branch 'facebookresearch:main' into modified-constants · b805b189
  Xing Han Lu authored 3 years ago
  
  b805b189
- updating codeowners to recent active users (#178) · 0af5b4b5
  Susan Zhang authored 3 years ago
  
  0af5b4b5
- added changes for resharding MP parts (#169) · c66f25ed
  ngoyal2707 authored 3 years ago
```
* added changes for resharding MP parts

* changes

* changes

* changes
```
  c66f25ed
- Update CODEOWNERS (#177) · 5cc79b1b
  Kurt Shuster authored 3 years ago
  
  5cc79b1b
22 Jun, 2022 4 commits
- [api] More temperature woes (#172) · 7946caf0
  Stephen Roller authored 3 years ago
  
  7946caf0
- OPT-66B Release (#171) · 84c412fb
  Susan Zhang authored 3 years ago
```
* 66b weights, md5sum

* add baselines logbook

* add link to arxiv, link to chronicles on project readme
```
  84c412fb
- [api] Support temp=0. Return errors as json. (#168) · fc037c7a
  Stephen Roller authored 3 years ago
```
* [api] Support temp=0. Return errors as json.

* Update hub_utils.py
```
  fc037c7a
- [api] Fix singular positive logit (#167) · 1aa40c2a
  Stephen Roller authored 3 years ago
```
* Save work

* Correctly handle.
```
  1aa40c2a
21 Jun, 2022 4 commits

Change default from int to str · 6fc6462c
Xing Han Lu authored 3 years ago

6fc6462c
Move BPE_FOLDER outside of `try/except` · 6dd38212
Xing Han Lu authored 3 years ago

6dd38212
Add a BPE folder · 23a27c50
Xing Han Lu authored 3 years ago

23a27c50

Singleton checkpoint needs to include decoder.version for single-ton... · f5442a18

Patrick von Platen authored 3 years ago

Singleton checkpoint needs to include decoder.version for single-ton checkpoint to run correctly (#164)

* Singleton checkpoint needs to include decoder.version

If we don't transfer the `"decoder.version"` to the singleton checkpoint, a very sneaky bug happens which was found by @thomasw21 as part of this PR: https://github.com/huggingface/transformers/pull/17785

If the `decoder.version` param is missing from the state_dict, then loading the singleton checkpoint sets the loaded layer_norm to `None` here: https://github.com/facebookresearch/metaseq/blob/e0c4f6b0e4c523906ad8d561f727e3f2ac3a8e73/metaseq/models/transformer.py#L932

So it's absolutely crucial that we include this variable.

I will update all of the converted HF checkpoints here later today and then I think we can be sure that OPT works correctly :partying_face: 
https://huggingface.co/models?other=opt_metasq



* Update convert_to_singleton.py

Co-authored-by: Stephen Roller <roller@fb.com>

f5442a18
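
The failure mode described in f5442a18 can be sketched roughly as follows. This is an illustration only, not metaseq's conversion script: the helper name `consolidate_shards`, the `PASSTHROUGH_KEYS` tuple, and the use of plain Python lists in place of tensors are all assumptions made for the example. The point is that bookkeeping entries such as `decoder.version` are not sharded weights, so a merge that only walks the sharded keys drops them unless they are copied explicitly.

```python
# Hypothetical sketch: merging model-parallel shards into a single state dict.
# Plain lists stand in for tensors; real code would concatenate tensors instead.

# Assumed set of non-sharded entries that must be carried over unchanged.
PASSTHROUGH_KEYS = ("decoder.version",)


def consolidate_shards(shards, sharded_keys):
    """Merge the given sharded keys and copy passthrough entries explicitly."""
    merged = {}
    for key in sharded_keys:
        # Concatenate each shard's slice of the parameter, in shard order.
        merged[key] = [value for shard in shards for value in shard[key]]
    # Without this loop the merged dict silently loses "decoder.version".
    for key in PASSTHROUGH_KEYS:
        if key in shards[0]:
            merged[key] = shards[0][key]
    return merged


shards = [
    {"decoder.version": [3.0], "decoder.embed.weight": [0.1, 0.2]},
    {"decoder.version": [3.0], "decoder.embed.weight": [0.3, 0.4]},
]
singleton = consolidate_shards(shards, ["decoder.embed.weight"])
```

Checking `"decoder.version" in singleton` after the merge is a cheap guard against the regression the commit message warns about.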

20 Jun, 2022 4 commits

Wrap with "int" · fe7d7e75
Xing Han Lu authored 3 years ago

fe7d7e75
Change constants to use environment variables when possible · c0afbd37
Xing Han Lu authored 3 years ago

c0afbd37
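
The pattern behind c0afbd37, fe7d7e75 and 6fc6462c (constants read from environment variables, wrapped with `int`, with string defaults) might look something like the sketch below. The variable names and default values are hypothetical and chosen for illustration; they are not taken from the repository. Environment values are always strings, which is why the default is given as a string and the conversion happens once, after the lookup.

```python
import os


def env_str(name, default):
    """Return the environment variable `name`, or `default` if it is unset."""
    return os.environ.get(name, default)


def env_int(name, default):
    """Like env_str, but convert the result to int.

    `default` is passed as a string so that both branches go through the
    same int() conversion.
    """
    return int(os.environ.get(name, default))


# Hypothetical constants; the names and defaults are assumptions.
EXAMPLE_MAX_SEQ_LEN = env_int("EXAMPLE_MAX_SEQ_LEN", "2048")
EXAMPLE_BPE_FOLDER = env_str("EXAMPLE_BPE_FOLDER", "/tmp/bpe")
```

With this layout, a deployment can override a value with `EXAMPLE_MAX_SEQ_LEN=1024` in the environment without editing the constants module.
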
Add missing Namespace import (#126) · e0c4f6b0
Victoria X Lin authored 3 years ago
```
* fix Namespace missing import

* add noqa comment to Namespace import line
```
e0c4f6b0

Support document level attention while training model parallel models (#104) · 056a1116

Punit Singh Koura authored 3 years ago


* document attention masking first commit

* Fixing bugs, formatting changes

* Fixing bug

Addressing comments, adding documentation

Fixing comments

Fixing lint

Fixing positional embedding reset

Bug fix for positional embeddings

Adding comments

Co-authored-by: Ramakanth Pasunuru <ramakanth.1729@gmail.com>

056a1116

18 Jun, 2022 1 commit
- [api] Simplify launch config. (#157) · 7c8427a6
  Stephen Roller authored 3 years ago
  
  7c8427a6
17 Jun, 2022 2 commits

Fix Beam Search (#156) · cf24413b

Kurt Shuster authored 3 years ago


* fix beam search

* add whitespace back

* Update test_sequence_generator.py

Co-authored-by: Stephen Roller <roller@fb.com>

cf24413b

fix off by one in API (#145) · ca180bb6

lilisierrayu authored 3 years ago

* making prompt_len independent of batchify implementation

* modify for echo=True case

* returning logprobs, to support logprob input

* setting need_logprobs depends on each request to save memory

ca180bb6

13 Jun, 2022 1 commit

Fix progress bar exception when both tensorboard and wandb are turned on (#148) · 69b16560

Victoria X Lin authored 3 years ago

* fix progress bar exception when both tensorboard and wandb are turned on

* prioritize wandb progress bar over tensorboard progress bar when both are set

69b16560

06 Jun, 2022 1 commit
- Add a 2nd md5sum check (#141) · ae825b2f
  Xing Han Lu authored 3 years ago
  
  ae825b2f
04 Jun, 2022 1 commit
- md5sum of shards (#138) · 000394b5
  Susan Zhang authored 3 years ago
  
  000394b5
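
A checksum verification of downloaded shards, as referenced by the two md5sum commits above, could be sketched like this. The function names and the shape of the `expected` mapping are assumptions for the example; the repository's own instructions may simply rely on the `md5sum` command-line tool. The file is read in chunks so that large shards do not have to fit in memory.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(expected):
    """Given {path: expected_hex_digest}, return the paths that do not match."""
    return [path for path, want in expected.items() if md5sum(path) != want]
```

An empty list from `find_mismatches` means every listed shard matched its published digest.
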
02 Jun, 2022 2 commits

Add eos mode for src tgt dataset (#107) · 72dcac44
Srinivasan Iyer authored 3 years ago
```
* Add eos mode for src tgt dataset

* lint

* Addressed comments

* lint
```
72dcac44

Fix errors when build docker (#130) · 821b43af

QIU Shuo authored 3 years ago


* fix docker build fails

* revert metaseq branch to main in dockerfile

* limit max version of hydra-core

Co-authored-by: qiushuo <qiushuo@microsoft.com>

821b43af

01 Jun, 2022 1 commit
- Add instructions to download the GPT2 asset files (#132) · d04780e0
  Xing Han Lu authored 3 years ago
  
  d04780e0
31 May, 2022 2 commits

BF16 support (#30) · 1639c607
ngoyal2707 authored 3 years ago

1639c607

Progress bar cleanup (#96) · b972b949

Susan Zhang authored 3 years ago

* remove unused build_progress_bar method

* remove noop/none logging as a log format option

* make log_format default to json and remove default_log_format arg

* move progress_bar to submodule

* rename to base_progress_bar

* module rename

* move out json progress bar

* move out tensorboard progress bar

* move out wandb progress wrapper

* split out helpers to utils, avoid circular import

* fix broken init

* remove module rename for progress bar

* move get_precise_epoch to higher level utils

* lint

* remove no progress bar flag

* fix more imports

* cleanup

* remove utils

* add license

b972b949

30 May, 2022 1 commit
- Pin omegaconf to 2.1.1, tensorboard to 2.8.0 (#127) · d04842c1
  Susan Zhang authored 3 years ago
```
* pin omegaconf to 2.1.1

* pin tensorboard to 2.8.0

* protobuf pin to 3.20.1
```
  d04842c1
24 May, 2022 2 commits
- Add Dockerfile (#120) · c190406e
  Peter Salanki authored 3 years ago
```
Based on the setup instructions
```
  c190406e
- Add Example Proportional Sampling Maximum (#109) · 31024523
  Srinivasan Iyer authored 3 years ago
```
* Add maximum for example proportional sampling

* lint

* Addressed comments
```
  31024523
23 May, 2022 1 commit
- Include diff with black check (#116) · 6cd6a708
  Susan Zhang authored 3 years ago
```
* include diff with black check

* fix lint
```
  6cd6a708
22 May, 2022 1 commit

Dynamic loss scaler changes for 66b (#115) · 0ba5657a

Susan Zhang authored 3 years ago

* set defaults for scale window, and init scale

* remove more default logic for scale window

* also decrease scale window with loss scale

* remove floating point error for hitting min loss scale

* also log out scale_window

* 0.03125 -> 2 ** -5

0ba5657a
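
The bullet points in 0ba5657a describe adjustments to a dynamic loss scaler. A generic version of that technique, under stated assumptions, is sketched below. It is not metaseq's implementation: the class name, the default values other than the `2 ** -5` floor, and the exact halving rule for the window are guesses made to illustrate the listed ideas (an initial scale, a scale window, shrinking the window together with the scale, and clamping at a minimum scale instead of raising an error).

```python
class DynamicLossScalerSketch:
    """Illustrative dynamic loss scaler; defaults here are assumptions."""

    def __init__(self, init_scale=2.0 ** 7, scale_window=8, min_loss_scale=2.0 ** -5):
        self.loss_scale = init_scale
        self.scale_window = scale_window
        self.min_loss_scale = min_loss_scale
        self._steps_since_overflow = 0

    def update(self, overflow):
        """Adjust the scale after one training step."""
        if overflow:
            # Halve the scale, but clamp at the floor rather than raising.
            self.loss_scale = max(self.loss_scale / 2, self.min_loss_scale)
            # Assumed rule: shrink the window together with the scale.
            self.scale_window = max(self.scale_window // 2, 1)
            self._steps_since_overflow = 0
        else:
            self._steps_since_overflow += 1
            if self._steps_since_overflow >= self.scale_window:
                # Enough clean steps in a row: try a larger scale again.
                self.loss_scale *= 2
                self._steps_since_overflow = 0
```

Writing the floor as `2.0 ** -5` rather than `0.03125` gives the same value and makes the power-of-two relationship with the halving and doubling steps explicit.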

20 May, 2022 1 commit
- Allow downloading to a target directory (#113) · 474c1f20
  Stella Biderman authored 3 years ago
  
  474c1f20