TransformationTransformer
TransformationTransformer is a fine-tuned DistilRoBERTa model. It is trained and evaluated on 10,000 manually annotated sentences drawn from the Q&A section of quarterly earnings conference calls. Specifically, it is trained on sentences spoken by firm executives to discriminate between sentences that allude to business transformation and sentences that discuss other topics. More details about the training procedure can be found below.
Background
Context on the project.
Usage
The model is intended for sentence classification: it creates a contextual text representation of the input sentence and outputs a probability value. LABEL_1 refers to a sentence that is predicted to contain transformation-related content, and LABEL_0 to a sentence that is predicted not to. The query should consist of a single sentence.
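To illustrate how a probability value can arise from a two-label classifier, here is a minimal sketch. It assumes the classification head emits one raw logit per label and that these are normalized with softmax; the logit values and the `softmax` helper are hypothetical and not taken from the model itself.

```python
import math

def softmax(logits):
    """Convert raw classifier logits into probabilities that sum to 1."""
    peak = max(logits)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one sentence, ordered as (LABEL_0, LABEL_1).
logits = [-1.2, 2.3]
probs = softmax(logits)

# The predicted label is the index of the largest probability.
predicted = f"LABEL_{probs.index(max(probs))}"
print(predicted, round(max(probs), 3))
```

In this sketch the second entry of `probs` would be read as the probability that the sentence contains transformation-related content.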
Usage (API)
import json
import requests
API_TOKEN = "<TOKEN>"  # replace with your personal Hugging Face API token
headers = {"Authorization": f"Bearer {API_TOKEN}"}
API_URL = "https://api-inference.huggingface.co/models/simonschoe/TransformationTransformer"
def query(payload):
    # Serialize the payload, send it to the inference API, and decode the JSON reply
    data = json.dumps(payload)
    response = requests.request("POST", API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8"))
query({"inputs": "<insert-sentence-here>"})
Usage (transformers)
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
tokenizer = AutoTokenizer.from_pretrained("simonschoe/TransformationTransformer")
model = AutoModelForSequenceClassification.from_pretrained("simonschoe/TransformationTransformer")
classifier = pipeline('text-classification', model=model, tokenizer=tokenizer)
classifier('<insert-sentence-here>')
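The classifier output can then be reduced to a yes/no decision. The sketch below assumes the pipeline returns a list with one dictionary per input sentence, holding the top `label` and its `score`, and that the model has exactly two labels. The helper name `contains_transformation`, the 0.5 threshold, and the sample predictions are hypothetical.

```python
def contains_transformation(predictions, threshold=0.5):
    """Return True if the first prediction indicates transformation-related content.

    `predictions` is assumed to look like [{"label": "LABEL_1", "score": 0.97}].
    """
    top = predictions[0]
    if top["label"] == "LABEL_1":
        prob_transformation = top["score"]
    else:
        # With two labels, the LABEL_1 probability is the complement of LABEL_0's.
        prob_transformation = 1.0 - top["score"]
    return prob_transformation >= threshold

# Hypothetical outputs, shaped like a text-classification pipeline result.
positive = [{"label": "LABEL_1", "score": 0.97}]
negative = [{"label": "LABEL_0", "score": 0.88}]

print(contains_transformation(positive))
print(contains_transformation(negative))
```

Raising `threshold` trades recall for precision, which may be preferable when only clear-cut transformation statements are of interest.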
Model Training
The model has been trained on text data stemming from earnings call transcripts. The data is restricted to a call's question-and-answer (Q&A) section and the remarks by firm executives. The data has been segmented into individual sentences using spacy.
Statistics of Training Data:
- Labeled sentences: 10,000
- Data distribution: xxx
- Inter-coder agreement: xxx
The following code snippet presents the training pipeline: