Pytorch-version BERT-flow: one can apply BERT-flow to any pre-trained language model (PLM) within the PyTorch framework.

Pytorch-bertflow

This is a re-implementation of BERT-flow in the PyTorch framework that can reproduce the results of the original repo. This code was used to reproduce the results in the TSDAE paper.

Usage

Please refer to the simple example ./example.py:

python example.py

Note

  • Please shuffle your training data; it makes a huge difference.
  • The pooling function makes a huge difference on some datasets (especially the ones used in the paper). To reproduce the results, please use 'first-last-avg', as in the sketch below.
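
For orientation, here is a minimal training sketch using TransformerGlow and AdamWeightDecayOptimizer from this repo's tflow_utils (the same API as in the issue script further down). The model name, sentences, and learning rate are placeholders; ./example.py remains the reference example.

from transformers import AutoTokenizer
from tflow_utils import TransformerGlow, AdamWeightDecayOptimizer

model_name_or_path = 'bert-base-uncased'  # placeholder: any Hugging Face PLM
bertflow = TransformerGlow(model_name_or_path, pooling='first-last-avg')  # pooling: 'mean', 'max', 'cls' or 'first-last-avg'
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

no_decay = ["bias", "LayerNorm.weight"]
optimizer = AdamWeightDecayOptimizer(
    params=[
        # Only the flow parameters (bertflow.glow) are trained; the Transformer stays frozen.
        {"params": [p for n, p in bertflow.glow.named_parameters()
                    if not any(nd in n for nd in no_decay)], "weight_decay": 0.01},
        {"params": [p for n, p in bertflow.glow.named_parameters()
                    if any(nd in n for nd in no_decay)], "weight_decay": 0.0},
    ],
    lr=1e-3,  # placeholder learning rate
    eps=1e-6,
)

sentences = ["Put your (shuffled!) training sentences here.", "..."]  # placeholder data
bertflow.train()
model_inputs = tokenizer(sentences, add_special_tokens=True, return_tensors='pt',
                         max_length=256, padding='longest', truncation=True)
z, loss = bertflow(model_inputs['input_ids'], model_inputs['attention_mask'], return_loss=True)  # z: sentence embeddings
optimizer.zero_grad()
loss.backward()
optimizer.step()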

Contact

Contact person and main contributor: Kexin Wang, [email protected]

https://www.ukp.tu-darmstadt.de/

https://www.tu-darmstadt.de/

Don't hesitate to send us an e-mail or report an issue if something is broken (and it shouldn't be) or if you have further questions.

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.

Owner
Ubiquitous Knowledge Processing Lab
Similar Resources

Create a semantic search engine with a neural network (i.e. BERT) whose knowledge base can be updated

Create a semantic search engine with a neural network (i.e. BERT) whose knowledge base can be updated. This engine can later be used for downstream tasks in NLP such as Q&A, summarization, generation, and natural language understanding (NLU).

Mar 20, 2022

Simple GUI where you can enter an article and get a crisp summarized version.

Text-Summarization-using-TextRank-BART Simple GUI where you can enter an article and get a crisp summarized version. How to run: Clone the repo Instal

Mar 3, 2022

Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.

Kashgari Overview | Performance | Installation | Documentation | Contributing 🎉 🎉 🎉 We released the 2.0.0 version with TF2 Support. 🎉 🎉 🎉 If you

Sep 15, 2022

An easy-to-use framework for BERT models, with trainers, various NLP tasks and detailed annotations

FantasyBert English | 中文 Introduction An easy-to-use framework for BERT models, with trainers, various NLP tasks and detailed annotations. You can imp

Jul 5, 2022

A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any other format

RITA DSL This is a language, loosely based on language Apache UIMA RUTA, focused on writing manual language rules, which compiles into either spaCy co

Apr 27, 2022

A PyTorch implementation of the WaveGlow: A Flow-based Generative Network for Speech Synthesis

WaveGlow A PyTorch implementation of the WaveGlow: A Flow-based Generative Network for Speech Synthesis Quick Start: Install requirements: pip install

Jul 14, 2022

Code for ACL 2021 main conference paper "Conversations are not Flat: Modeling the Intrinsic Information Flow between Dialogue Utterances".

Conversations are not Flat: Modeling the Intrinsic Information Flow between Dialogue Utterances This repository contains the code and pre-trained mode

Sep 6, 2022

Tensorflow Implementation of A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Apr 10, 2022
Comments
  • Finetuning all-MiniLM-L6-v2 ValueError

    Hello, thank you for your contribution! I am trying to fine-tune all-MiniLM-L6-v2 (https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) on my data, but after the first batch I get a ValueError and a loss of inf.

    ValueError: Expected value argument (Tensor of shape (4, 192, 1, 1)) to be within the support (Real()) of the distribution Normal(loc: torch.Size([4, 192, 1, 1]), scale: torch.Size([4, 192, 1, 1])), but found invalid values: tensor([[[[nan]],
         [[nan]],
    .....

    Here is my very simple script (I just replaced the data and put the training in a loop):

    
    import pandas as pd
    import numpy as np
    from tflow_utils import TransformerGlow, AdamWeightDecayOptimizer
    from transformers import AutoTokenizer,AutoModel
    
    model_name_or_path = '/tmp/all-MiniLM-L6-v2'
    bertflow = TransformerGlow(model_name_or_path, pooling='mean')  # pooling could be 'mean', 'max', 'cls' or 'first-last-avg' (mean pooling over the first and the last layers)
    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
    no_decay = ["bias", "LayerNorm.weight"]
    optimizer_grouped_parameters = [
        {
            "params": [p for n, p in bertflow.glow.named_parameters()  \
                            if not any(nd in n for nd in no_decay)],  # Note: only the parameters within bertflow.glow will be updated; the Transformer is frozen during training.
            "weight_decay": 0.01,
        },
        {
            "params": [p for n, p in bertflow.glow.named_parameters()  \
                            if any(nd in n for nd in no_decay)],
            "weight_decay": 0.0,
        },
    ]
    optimizer = AdamWeightDecayOptimizer(
        params=optimizer_grouped_parameters, 
        lr=1e-5, 
        eps=1e-6,
    )
    # Important: Remember to shuffle your training data!!! This makes a huge difference!!!
    
    np.random.seed(0)
    df = pd.read_csv("data/classification/data_small.csv")
    data = df.text.to_list().copy()
    np.random.shuffle(data)
    
    
    bertflow.train()
    batch_size = 4
    nb_batch = int(np.ceil(len(data) / batch_size))
    print(nb_batch)
    for batch_id in range(nb_batch):
        batch = data[batch_id*batch_size:(batch_id+1)*batch_size]
        model_inputs = tokenizer(
            batch,
            add_special_tokens=True,
            return_tensors='pt',
            max_length=256,
            padding='longest',
            truncation=True
        )
        z, loss = bertflow(model_inputs['input_ids'], model_inputs['attention_mask'], return_loss=True)  # Here z is the sentence embedding
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        print(batch_id, loss)
    
    

    Do you have any idea where this could come from? I have tried different learning rates, but it doesn't solve the problem.

Dec 30, 2021
Pytorch version of BERT-whitening

BERT-whitening This is the Pytorch implementation of "Whitening Sentence Representations for Better Semantics and Faster Retrieval". BERT-whitening is

Sep 23, 2022
LV-BERT: Exploiting Layer Variety for BERT (Findings of ACL 2021)

LV-BERT Introduction In this repo, we introduce LV-BERT by exploiting layer variety for BERT. For detailed description and experimental results, pleas

Aug 24, 2022
VD-BERT: A Unified Vision and Dialog Transformer with BERT

VD-BERT: A Unified Vision and Dialog Transformer with BERT PyTorch Code for the following paper at EMNLP2020: Title: VD-BERT: A Unified Vision and Dia

Sep 1, 2022
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.

textgenrnn Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code, or quickly tr

Sep 24, 2022
PRAnCER is a web platform that enables the rapid annotation of medical terms within clinical notes.

PRAnCER (Platform enabling Rapid Annotation for Clinical Entity Recognition) is a web platform that enables the rapid annotation of medical terms within clinical notes. A user can highlight spans of text and quickly map them to concepts in large vocabularies within a single, intuitive platform.

Sep 20, 2022
Official source for Spanish Language Models and resources made @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).

Spanish Language Models Corpora Number of documents Size (GB) BNE 201,080,084 570GB Models RoBERTa-base BNE: https://huggingface.co

Jul 27, 2022