---
title: Movie Review V2
emoji: πŸš€
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: 3.15.0
app_file: app.py
pinned: false
license: apache-2.0
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
This app asks the user to enter a movie review as a text string and predicts whether the sentiment of the review is 'Positive' or 'Negative'.
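For illustration, here is a minimal sketch of how such an app can be wired together with Gradio and a fine-tuned DistilBERT checkpoint (the `model` directory and the label ordering are assumptions; the actual `app.py` may differ):

```python
import torch
import gradio as gr
from transformers import DistilBertForSequenceClassification, DistilBertTokenizerFast

MODEL_DIR = "model"  # assumed location of the fine-tuned weights
tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained(MODEL_DIR)
model.eval()

def predict(review: str) -> str:
    # Tokenize the raw review text and run a single forward pass.
    inputs = tokenizer(review, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Assumes label 1 = Positive, following the usual IMDb convention.
    return "Positive" if logits.argmax(dim=-1).item() == 1 else "Negative"

gr.Interface(fn=predict, inputs="text", outputs="text").launch()
```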
The prediction is made by a Bidirectional Encoder Representations from Transformers (BERT) model, namely a fine-tuned version of DistilBERT (https://arxiv.org/abs/1910.01108).
We started from the pre-trained DistilBertForSequenceClassification model in the Hugging Face transformers library (https://huggingface.co/docs/transformers/v4.25.1/en/model_doc/distilbert#transformers.DistilBertForSequenceClassification) together with the matching DistilBertTokenizerFast tokenizer (https://huggingface.co/docs/transformers/v4.25.1/en/model_doc/distilbert#transformers.DistilBertTokenizerFast).
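As a minimal sketch, the two components can be instantiated as follows (using `distilbert-base-uncased`, the standard base checkpoint, as an assumed starting point):

```python
from transformers import DistilBertForSequenceClassification, DistilBertTokenizerFast

# num_labels=2 attaches a binary classification head (Positive/Negative)
# on top of the pre-trained DistilBERT encoder.
tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)
```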
We fine-tuned the model on the IMDb Large Movie Review Dataset (https://ai.stanford.edu/~amaas/data/sentiment/) for 3 epochs with a batch size of 16 and a learning rate of 1e-5. The loss and accuracy progressed as follows:
```
Epoch 0001 of 0003, batch 0000 of 2188 === Loss: 0.6800
Epoch 0001 of 0003, batch 0250 of 2188 === Loss: 0.2488
Epoch 0001 of 0003, batch 0500 of 2188 === Loss: 0.4501
Epoch 0001 of 0003, batch 0750 of 2188 === Loss: 0.1309
Epoch 0001 of 0003, batch 1000 of 2188 === Loss: 0.4273
Epoch 0001 of 0003, batch 1250 of 2188 === Loss: 0.3193
Epoch 0001 of 0003, batch 1500 of 2188 === Loss: 0.5093
Epoch 0001 of 0003, batch 1750 of 2188 === Loss: 0.4583
Epoch 0001 of 0003, batch 2000 of 2188 === Loss: 0.3154
Training accuracy: 96.62 === Valid accuracy: 92.54
Epoch 0002 of 0003, batch 0000 of 2188 === Loss: 0.1179
Epoch 0002 of 0003, batch 0250 of 2188 === Loss: 0.0136
Epoch 0002 of 0003, batch 0500 of 2188 === Loss: 0.1435
Epoch 0002 of 0003, batch 0750 of 2188 === Loss: 0.0454
Epoch 0002 of 0003, batch 1000 of 2188 === Loss: 0.0768
Epoch 0002 of 0003, batch 1250 of 2188 === Loss: 0.2802
Epoch 0002 of 0003, batch 1500 of 2188 === Loss: 0.0200
Epoch 0002 of 0003, batch 1750 of 2188 === Loss: 0.1257
Epoch 0002 of 0003, batch 2000 of 2188 === Loss: 0.1308
Training accuracy: 98.76 === Valid accuracy: 92.46
Epoch 0003 of 0003, batch 0000 of 2188 === Loss: 0.0074
Epoch 0003 of 0003, batch 0250 of 2188 === Loss: 0.0039
Epoch 0003 of 0003, batch 0500 of 2188 === Loss: 0.0611
Epoch 0003 of 0003, batch 0750 of 2188 === Loss: 0.0306
Epoch 0003 of 0003, batch 1000 of 2188 === Loss: 0.1513
Epoch 0003 of 0003, batch 1250 of 2188 === Loss: 0.0014
Epoch 0003 of 0003, batch 1500 of 2188 === Loss: 0.0020
Epoch 0003 of 0003, batch 1750 of 2188 === Loss: 0.1905
Epoch 0003 of 0003, batch 2000 of 2188 === Loss: 0.1545
Training accuracy: 99.43 === Valid accuracy: 92.38
```
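For reference, here is a minimal sketch of a fine-tuning loop matching the hyperparameters above (3 epochs, batch size 16, learning rate 1e-5). The exact train/validation split is not specified here (the batch counts in the log imply roughly 35k training samples rather than the standard 25k split), so the `imdb` Hub dataset and the standard split below are assumptions:

```python
import torch
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import DistilBertForSequenceClassification, DistilBertTokenizerFast

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# The IMDb Large Movie Review Dataset is mirrored on the Hub as "imdb".
dataset = load_dataset("imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

dataset = dataset.map(tokenize, batched=True)
dataset.set_format("torch", columns=["input_ids", "attention_mask", "label"])

loader = DataLoader(dataset["train"], batch_size=16, shuffle=True)  # batch size 16
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)          # learning rate 1e-5

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.train()
for epoch in range(3):  # 3 epochs
    for step, batch in enumerate(loader):
        optimizer.zero_grad()
        out = model(
            input_ids=batch["input_ids"].to(device),
            attention_mask=batch["attention_mask"].to(device),
            labels=batch["label"].to(device),
        )
        out.loss.backward()
        optimizer.step()
        if step % 250 == 0:  # mirrors the logging cadence in the log above
            print(f"Epoch {epoch + 1:04d} of 0003, batch {step:04d} "
                  f"of {len(loader)} === Loss: {out.loss.item():.4f}")
```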