
    MPNet

    MPNet: Masked and Permuted Pre-training for Language Understanding, by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, and Tie-Yan Liu, is a novel pre-training method for language understanding tasks. It addresses the problems of MLM (masked language modeling) in BERT and PLM (permuted language modeling) in XLNet and achieves better accuracy.
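
    As a rough intuition (a toy illustration of the objective only, not the repository's implementation): MPNet permutes the token positions, feeds the non-predicted tokens together with mask placeholders that keep the positions of the predicted tokens, and predicts the remaining tokens autoregressively, so it retains full position information (unlike PLM) without MLM's independence assumption.

    # Toy illustration of masked-and-permuted input construction (conceptual only;
    # the actual MPNet training code in this repository is far more involved).
    import random

    def mpnet_style_inputs(tokens, predict_ratio=0.15, seed=0):
        rng = random.Random(seed)
        order = list(range(len(tokens)))
        rng.shuffle(order)                          # random permutation z of positions
        c = int(len(tokens) * (1 - predict_ratio))  # split: keep ~85%, predict ~15%
        kept, predicted = order[:c], order[c:]
        # The model sees the non-predicted tokens plus [MASK] placeholders that still
        # carry the positions of the predicted tokens, so full position information
        # of the sentence is available (unlike plain permuted LM).
        visible = [(pos, tokens[pos]) for pos in kept] + [(pos, "[MASK]") for pos in predicted]
        # Targets are the predicted tokens, modeled autoregressively in permuted order
        # (unlike masked LM, which predicts them independently).
        targets = [(pos, tokens[pos]) for pos in predicted]
        return visible, targets

    visible, targets = mpnet_style_inputs("the cat sat on the mat".split())
    print(visible)
    print(targets)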

    News: We have updated the pre-trained models.

    Supported Features

    • A unified view and implementation of several pre-training models including BERT, XLNet, MPNet, etc.
    • Code for pre-training and fine-tuning on a variety of language understanding tasks (GLUE, SQuAD, RACE, etc.).

    Installation

    We implement MPNet and this pre-training toolkit based on the fairseq codebase. The installation is as follows:

    pip install --editable pretraining/
    pip install pytorch_transformers==1.0.0 transformers scipy sklearn
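
    As a quick sanity check (our suggestion, not part of the original instructions), you can confirm that the editable fairseq build and transformers import cleanly:

    # Quick check that the editable fairseq build and transformers are importable
    # after installation.
    import fairseq
    import transformers

    print(fairseq.__version__, transformers.__version__)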
    

    Pre-training MPNet

    Our model is pre-trained with the BERT dictionary; you first need to pip install transformers to use the BERT tokenizer. We provide a script, encode.py, and a dictionary file, dict.txt, to tokenize your corpus. You can modify encode.py if you want to use another tokenizer (such as RoBERTa).
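
    For intuition, encode.py essentially maps each raw line to space-separated WordPiece tokens in the format expected by fairseq-preprocess; a minimal sketch of that step, assuming the bert-base-uncased vocabulary (the actual script also handles multiprocessing and edge cases):

    # Minimal sketch: convert raw text lines into space-separated BERT BPE tokens,
    # the format consumed by fairseq-preprocess with MPNet/dict.txt.
    # Assumes the bert-base-uncased vocabulary; the real encode.py may differ in details.
    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

    with open("wikitext-103-raw/wiki.valid.raw", encoding="utf-8") as fin, \
         open("wikitext-103-raw/wiki.valid.bpe", "w", encoding="utf-8") as fout:
        for line in fin:
            tokens = tokenizer.tokenize(line.strip())
            fout.write(" ".join(tokens) + "\n")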

    1) Preprocess data

    We use WikiText-103 as a demo. The running script is as follows:

    wget https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-103-raw-v1.zip
    unzip wikitext-103-raw-v1.zip
    
    for SPLIT in train valid test; do \
        python MPNet/encode.py \
            --inputs wikitext-103-raw/wiki.${SPLIT}.raw \
            --outputs wikitext-103-raw/wiki.${SPLIT}.bpe \
            --keep-empty \
            --workers 60; \
    done
    

    Then, we need to binarize the data. The command for binarizing the data is as follows:

    fairseq-preprocess \
        --only-source \
        --srcdict MPNet/dict.txt \
        --trainpref wikitext-103-raw/wiki.train.bpe \
        --validpref wikitext-103-raw/wiki.valid.bpe \
        --testpref wikitext-103-raw/wiki.test.bpe \
        --destdir data-bin/wikitext-103 \
        --workers 60
    

    2) Pre-train MPNet

    The command below trains an MPNet model:

    TOTAL_UPDATES=125000    # Total number of training steps
    WARMUP_UPDATES=10000    # Warmup the learning rate over this many updates
    PEAK_LR=0.0005          # Peak learning rate, adjust as needed
    TOKENS_PER_SAMPLE=512   # Max sequence length
    MAX_POSITIONS=512       # Num. positional embeddings (usually same as above)
    MAX_SENTENCES=16        # Number of sequences per batch (batch size)
    UPDATE_FREQ=16          # Increase the batch size 16x
    
    DATA_DIR=data-bin/wikitext-103
    
    fairseq-train --fp16 $DATA_DIR \
        --task masked_permutation_lm --criterion masked_permutation_cross_entropy \
        --arch mpnet_base --sample-break-mode complete --tokens-per-sample $TOKENS_PER_SAMPLE \
        --optimizer adam --adam-betas '(0.9,0.98)' --adam-eps 1e-6 --clip-norm 0.0 \
        --lr-scheduler polynomial_decay --lr $PEAK_LR --warmup-updates $WARMUP_UPDATES --total-num-update $TOTAL_UPDATES \
        --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
        --max-sentences $MAX_SENTENCES --update-freq $UPDATE_FREQ \
        --max-update $TOTAL_UPDATES --log-format simple --log-interval 1 --input-mode 'mpnet'
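
    For reference, the effective batch size with these settings is MAX_SENTENCES × UPDATE_FREQ × the number of GPUs; a quick sanity check (the GPU count below is a hypothetical example, not something specified in this README):

    # Sanity check of the effective batch size implied by the settings above.
    max_sentences = 16  # MAX_SENTENCES: sequences per GPU per step
    update_freq = 16    # UPDATE_FREQ: gradient accumulation steps
    num_gpus = 8        # hypothetical GPU count

    print(max_sentences * update_freq * num_gpus)  # 2048 sequences per update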
    

    Note: You can replace the --arch value with mpnet_rel_base and add the flags --mask-whole-words --bpe bert to use relative position embeddings and whole word masking.

    Note: You can set --input-mode to mlm or plm to train a masked language model or a permuted language model instead.

    Pre-trained models

    We have updated the final pre-trained MPNet model for fine-tuning.

    You can load the pre-trained MPNet model like this:

    import torch
    from fairseq.models.masked_permutation_net import MPNet

    # Load the pre-trained checkpoint with the BERT BPE; 'path/to/data' should point
    # to the binarized data directory (e.g. data-bin/wikitext-103).
    mpnet = MPNet.from_pretrained('checkpoints', 'checkpoint_best.pt', 'path/to/data', bpe='bert')
    assert isinstance(mpnet.model, torch.nn.Module)
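
    If the returned wrapper follows fairseq's usual hub interface (an assumption on our part; check the repository code for the exact API), you can then encode a sentence and extract features roughly like this:

    # Hypothetical usage sketch, assuming the wrapper mirrors fairseq's RoBERTa
    # hub interface (encode / extract_features); verify the method names against
    # the repository before relying on them.
    mpnet.eval()  # disable dropout for deterministic features

    tokens = mpnet.encode('Hello world!')       # BPE-encode to a tensor of token ids
    features = mpnet.extract_features(tokens)   # shape: (1, seq_len, hidden_size)
    print(features.shape)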

    Fine-tuning MPNet on downstream tasks

    Acknowledgements

    Our code is based on fairseq-0.8.0. Thanks for their contribution to the open-source community.

    Reference

    If you find this toolkit useful in your work, you can cite the corresponding paper listed below:

    @article{song2020mpnet,
        title={MPNet: Masked and Permuted Pre-training for Language Understanding},
        author={Song, Kaitao and Tan, Xu and Qin, Tao and Lu, Jianfeng and Liu, Tie-Yan},
        journal={arXiv preprint arXiv:2004.09297},
        year={2020}
    }
    

    Related Works

    主站蜘蛛池模板: 亚洲国产欧美国产综合一区| 中文字幕色网站| 欧美xxxx新一区二区三区| 亚洲视频456| 秦先生第15部大战宝在线观看 | 制服丝袜第六页| 蜜桃AV噜噜一区二区三区| 国产大片b站免费观看推荐| 日本三级韩国三级欧美三级| 国产精品福利一区| 99久久99视频| 大学生一级特黄的免费大片视频 | 中文字幕乱码系列免费| 菠萝蜜亏亏带痛声的视频| 国产熟睡乱子伦午夜视频| 1a级毛片免费观看| 国产调教在线观看| 99九九精品免费视频观看| 天天做人人爱夜夜爽2020毛片| www免费插插视频| 好吊妞乱淫欧美| 一二三四视频在线观看韩国电视剧| 成人欧美一区二区三区在线观看| 久9这里精品免费视频| 无遮无挡爽爽免费视频| 久久www视频| 无遮挡h肉动漫在线观看日本| 久久久久夜夜夜精品国产| 日本猛少妇色xxxxx猛交| 久久国产精品久久久久久| 日韩中文字幕在线播放| 久久精品国产99国产精品| 日韩在线视频二区| 久久天天躁狠狠躁夜夜不卡| 日本簧片在线观看| 久久亚洲最大成人网4438| 日本亚洲欧美在线视观看| 久久久久久久综合狠狠综合| 日批视频在线免费观看| 中文字幕无码精品三级在线电影 | 国模精品一区二区三区视频|