MyneFactory/MF-Base

@@ -56,12 +56,23 @@ license: creativeml-openrail-m
   <h2 style="font-size:28px; font-family: Arial, Helvetica, sans-serif; font-weight: bold; margin:0; color: #222;">Model Info</h2>
   <p style="font-size: 18px;  color: #666;">
     <strong>Downloads: </strong>
-    <a style="color: #333" href="https://huggingface.co/MyneFactory/MF-Base/blob/main/Full%20release%20models/MyneFactoryBase%20V1.0.ckpt">MyneFactoryBase V1.0.ckpt</a>,
   </p>
   <p style="font-size: 18px; color: #666;">
-    <strong>Technical: </strong>
-    <span>The model was trained on 23,088 samples from Yande.re using a base model NAI, with a resolution of 768 and 512 pixels using aspect ratio buckets to prevent unnatural cropping. The model was trained for 90 epochs (500,000+ steps) using a stepped cosine and stepped constant LR. The LR was manually stepped every 10 epochs and was estimated using proportionality when estimating LR for each additional batch size. The training was done on an RTX 4090 with EveryDream 2 Trainer using DDIM sample scheduler and DDPM noise scheduler, with Adam8bit optimizer, mix precision, and xformers enabled. A conditional dropout of tags was set to 0.125. Verification testing was done on various milestones, and logs were monitored via Tensorboard. </span>
   </p>
   <p style="font-size: 18px; color: #666;">
     <strong>Authors: </strong><span>Juusoz, 金Goldkoron, tsmkirby</span>
   </p>

   <h2 style="font-size:28px; font-family: Arial, Helvetica, sans-serif; font-weight: bold; margin:0; color: #222;">Model Info</h2>
   <p style="font-size: 18px;  color: #666;">
     <strong>Downloads: </strong>
+    <a style="color: #333" href="https://huggingface.co/MyneFactory/MF-Base/blob/main/Full%20release%20models/MyneFactoryBase%20V1.0.ckpt">MyneFactoryBase V1.0.ckpt</a>
   </p>
+  <!-- Technical details start here -->
+  <h3 style="font-size:24px; font-family: Arial, Helvetica, sans-serif; font-weight: bold; margin:20px 0; color: #222;">Technical Details</h3>
+  <h4 style="font-size:20px; font-family: Arial, Helvetica, sans-serif; font-weight: bold; margin:20px 0; color: #222;">Model Training</h4>
   <p style="font-size: 18px; color: #666;">
+    MyneFactoryBase was trained on 23,088 samples from Yande.re. File captions were generated with three passes of the WD1.4 tagger to identify as many objects in the training data as possible. A second captioning run used a single tagger with a reduced threshold to produce shorter captions for later use. The model was trained with the NAI model as the base, using the Adam optimizer with a manually set maximum learning rate and cosine decay. Training was done on an RTX 4090 with a batch size of 4, using the DDIM sample scheduler and the DDPM noise scheduler with mixed precision.
   </p>
+  <h4 style="font-size:20px; font-family: Arial, Helvetica, sans-serif; font-weight: bold; margin:20px 0; color: #222;">Text Encoder Training</h4>
+  <p style="font-size: 18px; color: #666;">
+    The text encoder (TE) was trained for 50% of the training duration, alternating between frozen and unfrozen every 10 epochs. During the final 20 epochs of fine-tuning, the TE was frozen.
+  </p>
+  <h4 style="font-size:20px; font-family: Arial, Helvetica, sans-serif; font-weight: bold; margin:20px 0; color: #222;">Block Merge</h4>
+  <p style="font-size: 18px; color: #666;">
+    At the 20-epoch milestone, a block merge was done with BasilMix. However, the merged weights were quickly trained out, and by the end of training the weights had shifted entirely back toward the training data. Ultimately, the block merge was not used for the final release.
+  </p>
+  <!-- Technical details end here -->
   <p style="font-size: 18px; color: #666;">
     <strong>Authors: </strong><span>Juusoz, 金Goldkoron, tsmkirby</span>
   </p>
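
The added "Model Training" text mentions a manually set maximum learning rate with cosine decay. As a rough illustration of what such a schedule computes, here is a minimal sketch in plain Python. It is not the EveryDream 2 implementation, and the function name `cosine_decay_lr` and the values `LR_MAX` and `TOTAL_STEPS` are hypothetical placeholders, since the model card does not state the actual learning rate or the exact schedule code.

```python
import math


def cosine_decay_lr(step: int, total_steps: int, lr_max: float, lr_min: float = 0.0) -> float:
    """Return the cosine-decayed learning rate at a given 0-based step.

    The rate starts at lr_max (step 0) and falls smoothly to lr_min
    (step == total_steps) along half a cosine period.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp the step so values outside [0, total_steps] stay at the endpoints.
    progress = min(max(step, 0), total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    LR_MAX = 1e-6       # hypothetical maximum learning rate, not from the model card
    TOTAL_STEPS = 1000  # hypothetical schedule length, not from the model card
    for step in (0, 250, 500, 750, 1000):
        print(step, cosine_decay_lr(step, TOTAL_STEPS, LR_MAX))
```

If the rate was re-set by hand at each milestone, as the removed text suggests ("manually stepped every 10 epochs"), one way to approximate that would be to call this function once per 10-epoch segment with a new `lr_max`; that reading is an assumption, not something the card confirms.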