aplewe committed
Commit aa413e2
1 Parent(s): 24bd18f

added link to samples folder readme
Files changed (1): README.md (+2 -0)

README.md CHANGED
@@ -5,6 +5,8 @@ license: creativeml-openrail-m
 
 ![](./samples/5.png)
 
+more samples [here](./samples/README.md)
+
 Greetings, Internet. The purpose of the Storytime model is to create a Stable Diffusion model that "communicates" with text and images. My current focus is on the English language, but I believe the techniques I apply are applicable to other written languages as well. Inasmuch as "language" encompasses a huge range of notions and ideas (many of which I am probably ignorant of as a hobbyist), the purpose specifically is to use undifferentiated image data to "teach" Stable Diffusion various language concepts, then see whether it can adopt them and apply them to images.
 
 To give you a sense of what I mean, this model is Stable Diffusion v1.5 fine-tuned using Dreambooth on a small dataset of alphabet flashcards, simple word lists, grammatical concepts presented visually via charts, images of pages of text, and images of text along with pictures, all in English. However, the letters, words, sentences, paragraphs, and other mechanics are not specifically identified in the captions. A flashcard showing the letter "C", for instance, will have a generic caption such as "a picture of a letter from the English alphabet"; the letter itself is not identified. An image of a list of common sight words will have a caption similar to "A picture showing common words in the English language written using the English alphabet". And so on. At some point I will publish the data I use for training, but I want to be sure about sourcing and attribution (and rectify any issues that may be problematic) before publishing.
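The captioning scheme the README describes can be pictured as a plain image-to-caption mapping in which each caption names only the concept category, never the specific letter or words shown. A minimal sketch, assuming hypothetical filenames (not the author's actual training data):

```python
# Hypothetical sketch of the generic-captioning scheme described in the
# README: each training image is paired with a caption naming the concept
# category, not the specific content. Filenames here are illustrative
# assumptions, not the author's real dataset.
captions = {
    # a flashcard showing the letter "C" -- the caption never says which letter
    "flashcard_01.png": "a picture of a letter from the English alphabet",
    # a list of common sight words -- no individual word is named
    "sight_words_01.png": (
        "A picture showing common words in the English language "
        "written using the English alphabet"
    ),
}
```

The point of the design, as the README explains it, is that the model must extract the letter forms and word structures from the undifferentiated image data itself, since the captions deliberately withhold them.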