<p>Inspiration drawn from <a href="">TaskMatrix, also known as Visual ChatGPT</a>.</p>
<hr> | |
<p><a href="" target="_blank" style="float:right; font-size:smaller">source</a></p> | |
<section id="format_image" class="level3"> | |
<h3 class="anchored" data-anchor-id="format_image">format_image</h3> | |
<blockquote class="blockquote"> | |
<pre><code> format_image (image:str)</code></pre> | |
</blockquote> | |
<table class="table"> | |
<thead> | |
<tr class="header"> | |
<th></th> | |
<th><strong>Type</strong></th> | |
<th><strong>Details</strong></th> | |
</tr> | |
</thead> | |
<tbody> | |
<tr class="odd"> | |
<td>image</td> | |
<td>str</td> | |
<td>Image file path</td> | |
</tr> | |
</tbody> | |
</table> | |
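<p>The page does not show what <code>format_image</code> does beyond taking an image file path. As a rough sketch only, the version below assumes it follows the resizing convention used in TaskMatrix: scale the image so its longer side is about 512 pixels, round each side to a multiple of 64, and convert to RGB. The names <code>target_size</code> and <code>format_image_sketch</code> and both default values are hypothetical.</p>

```python
from pathlib import Path


def target_size(width: int, height: int, max_side: int = 512, multiple: int = 64) -> tuple[int, int]:
    """Scale so the longer side is about max_side, rounding each side to a multiple."""
    ratio = min(max_side / width, max_side / height)
    new_w = max(multiple, round(width * ratio / multiple) * multiple)
    new_h = max(multiple, round(height * ratio / multiple) * multiple)
    return new_w, new_h


def format_image_sketch(image: str):
    """Hypothetical stand-in for format_image: open, resize, convert to RGB."""
    from PIL import Image  # imported lazily so target_size works without Pillow

    img = Image.open(Path(image))
    img = img.resize(target_size(*img.size))
    return img.convert("RGB")
```

<p>Keeping the size arithmetic in its own function makes it easy to check without loading an image.</p>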
<hr> | |
<p><a href="" target="_blank" style="float:right; font-size:smaller">source</a></p> | |
</section> | |
<section id="blipimagecaptioning" class="level3"> | |
<h3 class="anchored" data-anchor-id="blipimagecaptioning">BlipImageCaptioning</h3> | |
<blockquote class="blockquote"> | |
<pre><code> BlipImageCaptioning (device:str)</code></pre> | |
</blockquote> | |
<p>Useful when you want to know what is inside the photo.</p> | |
<hr> | |
<p><a href="" target="_blank" style="float:right; font-size:smaller">source</a></p> | |
</section> | |
<section id="blipimagecaptioning.inference" class="level3"> | |
<h3 class="anchored" data-anchor-id="blipimagecaptioning.inference">BlipImageCaptioning.inference</h3> | |
<blockquote class="blockquote"> | |
<pre><code> BlipImageCaptioning.inference (image:PIL.Image)</code></pre>
</blockquote> | |
<table class="table"> | |
<thead> | |
<tr class="header"> | |
<th></th> | |
<th><strong>Type</strong></th> | |
<th><strong>Details</strong></th> | |
</tr> | |
</thead> | |
<tbody> | |
<tr class="odd"> | |
<td>image</td> | |
<td>PIL.Image</td> | |
<td></td> | |
</tr> | |
<tr class="even"> | |
<td><strong>Returns</strong></td> | |
<td><strong>str</strong></td> | |
<td><strong>Caption for the image</strong></td> | |
</tr> | |
</tbody> | |
</table> | |
<hr> | |
<p><a href="" target="_blank" style="float:right; font-size:smaller">source</a></p> | |
</section> | |
<section id="blipvqa" class="level3"> | |
<h3 class="anchored" data-anchor-id="blipvqa">BlipVQA</h3> | |
<blockquote class="blockquote"> | |
<pre><code> BlipVQA (device:str)</code></pre> | |
</blockquote> | |
<p>BLIP Visual Question Answering. Useful when you need an answer to a question about an image. Example questions: "What is the background color of this image?", "How many cats are in this figure?", "What is in this figure?"</p>
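<p>The implementation of <code>BlipVQA</code> is not shown on this page. A minimal sketch of how such a wrapper could be built on the Hugging Face <code>transformers</code> BLIP classes is given below. The class name <code>BlipVQASketch</code> and the choice of the <code>Salesforce/blip-vqa-base</code> checkpoint are assumptions, not taken from the source.</p>

```python
class BlipVQASketch:
    """Hypothetical stand-in for BlipVQA, built on Hugging Face transformers."""

    def __init__(self, device: str, checkpoint: str = "Salesforce/blip-vqa-base"):
        # Imported lazily so the class can be defined without transformers installed.
        from transformers import BlipForQuestionAnswering, BlipProcessor

        self.device = device
        self.processor = BlipProcessor.from_pretrained(checkpoint)
        self.model = BlipForQuestionAnswering.from_pretrained(checkpoint).to(device)

    def inference(self, image, question: str) -> str:
        """Answer a free-text question about a PIL image."""
        # The processor prepares both the pixels and the tokenized question.
        inputs = self.processor(image, question, return_tensors="pt").to(self.device)
        output_ids = self.model.generate(**inputs)
        return self.processor.decode(output_ids[0], skip_special_tokens=True)
```

<p>Constructing the object downloads the checkpoint on first use, so it is worth creating it once and reusing it across images.</p>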
<hr> | |
<p><a href="" target="_blank" style="float:right; font-size:smaller">source</a></p> | |
</section> | |
<section id="blipvqa.inference" class="level3"> | |
<h3 class="anchored" data-anchor-id="blipvqa.inference">BlipVQA.inference</h3> | |
<blockquote class="blockquote"> | |
<pre><code> BlipVQA.inference (image:PIL.Image, question:str)</code></pre>
</blockquote> | |
<table class="table"> | |
<thead> | |
<tr class="header"> | |
<th></th> | |
<th><strong>Type</strong></th> | |
<th><strong>Details</strong></th> | |
</tr> | |
</thead> | |
<tbody> | |
<tr class="odd"> | |
<td>image</td> | |
<td>PIL.Image</td> | |
<td></td> | |
</tr> | |
<tr class="even"> | |
<td>question</td> | |
<td>str</td> | |
<td></td> | |
</tr> | |
<tr class="odd"> | |
<td><strong>Returns</strong></td> | |
<td><strong>str</strong></td> | |
<td><strong>Answer to the query on the image</strong></td> | |
</tr> | |
</tbody> | |
</table> | |
<div class="cell"> | |
<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>sample_images <span class="op">=</span> os.listdir(SAMPLE_IMG_DIR)</span> | |
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a>sample_images</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | |
<div class="cell-output cell-output-display"> | |
<pre><code>['veggie-fridge.jpeg',
 'veg-groceries-table.jpg',
 'fridge-splendid.jpg',
 'neat-veg-groceries.jpg',
 'veg-groceries-table.jpeg',
 'Fruits-and-vegetables-one-a-table.jpg']</code></pre>
</div> | |
</div> | |
<div class="cell"> | |
<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> img <span class="kw">in</span> sample_images:</span> | |
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a> display(format_image(SAMPLE_IMG_DIR <span class="op">/</span> img))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-8-output-1.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-8-output-2.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-8-output-3.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-8-output-4.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-8-output-5.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-8-output-6.png" class="img-fluid"></p> | |
</div> | |
</div> | |
<p>The process:</p> | |
<ol type="1"> | |
<li>Format image</li> | |
<li>Get description (caption)</li> | |
<li>Pass caption and ingredient queries to VQA</li> | |
</ol> | |
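<p>The steps above can be sketched as one function. To keep the example runnable without any model, the captioner and the VQA model are passed in as plain callables. The name <code>describe_ingredients</code> is hypothetical, and the questions are the ones used in the cells on this page. As in those cells, the caption is reported alongside the answers rather than fed into the questions.</p>

```python
from typing import Any, Callable, Sequence

# Questions taken from the notebook cells on this page.
INGREDIENT_QUESTIONS = (
    "What are three of the fruits seen in the image if any?",
    "What grains and starches are in the image if any?",
    "Is there plant-based milk in the image?",
)


def describe_ingredients(
    image: Any,
    caption_fn: Callable[[Any], str],
    vqa_fn: Callable[[Any, str], str],
    questions: Sequence[str] = INGREDIENT_QUESTIONS,
) -> str:
    """Caption an already formatted image, then ask each ingredient question."""
    caption = caption_fn(image)
    answers = [vqa_fn(image, question) for question in questions]
    # One line for the caption, then one line per answer.
    return "\n".join([caption, *answers])
```

<p>With the objects created on this page, a call might look like <code>describe_ingredients(img, img_cap.inference, vqa.inference)</code>.</p>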
<div class="cell"> | |
<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a>vqa <span class="op">=</span> BlipVQA(<span class="st">"cpu"</span>)</span> | |
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a>img_cap <span class="op">=</span> BlipImageCaptioning(<span class="st">"cpu"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | |
</div> | |
<div class="cell"> | |
<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> img <span class="kw">in</span> sample_images:</span>
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a>    img <span class="op">=</span> format_image(SAMPLE_IMG_DIR <span class="op">/</span> img)</span>
<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a>    <span class="op">%</span>time desc <span class="op">=</span> img_cap.inference(img)</span>
<span id="cb10-4"><a href="#cb10-4" aria-hidden="true" tabindex="-1"></a>    display(desc, img.resize((<span class="bu">int</span>(img.size[<span class="dv">0</span>] <span class="op">*</span> <span class="fl">0.5</span>), <span class="bu">int</span>(img.size[<span class="dv">1</span>] <span class="op">*</span> <span class="fl">0.5</span>))))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout"> | |
<pre><code>CPU times: user 11.4 s, sys: 7.42 ms, total: 11.4 s
Wall time: 1.19 s
CPU times: user 13.5 s, sys: 7.5 ms, total: 13.5 s
Wall time: 1.36 s
CPU times: user 12 s, sys: 0 ns, total: 12 s
Wall time: 1.21 s
CPU times: user 12.5 s, sys: 0 ns, total: 12.5 s
Wall time: 1.27 s
CPU times: user 9.25 s, sys: 7.71 ms, total: 9.25 s
Wall time: 936 ms
CPU times: user 15.7 s, sys: 7.66 ms, total: 15.7 s
Wall time: 1.58 s</code></pre>
</div> | |
<div class="cell-output cell-output-display"> | |
<pre><code>'a refrigerator with food inside'</code></pre> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-10-output-3.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<pre><code>'a table with a variety of fruits and vegetables'</code></pre> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-10-output-5.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<pre><code>'a refrigerator filled with food and drinks'</code></pre> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-10-output-7.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<pre><code>'a counter with various foods on it'</code></pre> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-10-output-9.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<pre><code>'a wooden table'</code></pre> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-10-output-11.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<pre><code>'a table with a variety of fruits and vegetables'</code></pre> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-10-output-13.png" class="img-fluid"></p> | |
</div> | |
</div> | |
<div class="cell"> | |
<div class="sourceCode cell-code" id="cb18"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> img <span class="kw">in</span> sample_images:</span>
<span id="cb18-2"><a href="#cb18-2" aria-hidden="true" tabindex="-1"></a>    img <span class="op">=</span> format_image(SAMPLE_IMG_DIR <span class="op">/</span> img)</span>
<span id="cb18-3"><a href="#cb18-3" aria-hidden="true" tabindex="-1"></a>    desc <span class="op">=</span> img_cap.inference(img)</span>
<span id="cb18-4"><a href="#cb18-4" aria-hidden="true" tabindex="-1"></a>    <span class="op">%</span>time answer <span class="op">=</span> vqa.inference(img, <span class="ss">f"What are three of the vegetables seen in the image if any?"</span>)</span>
<span id="cb18-5"><a href="#cb18-5" aria-hidden="true" tabindex="-1"></a>    answer <span class="op">+=</span> <span class="st">"</span><span class="ch">\n</span><span class="st">"</span> <span class="op">+</span> vqa.inference(</span>
<span id="cb18-6"><a href="#cb18-6" aria-hidden="true" tabindex="-1"></a>        img, <span class="ss">f"What are three of the fruits seen in the image if any?"</span></span>
<span id="cb18-7"><a href="#cb18-7" aria-hidden="true" tabindex="-1"></a>    )</span>
<span id="cb18-8"><a href="#cb18-8" aria-hidden="true" tabindex="-1"></a>    answer <span class="op">+=</span> <span class="st">"</span><span class="ch">\n</span><span class="st">"</span> <span class="op">+</span> vqa.inference(</span>
<span id="cb18-9"><a href="#cb18-9" aria-hidden="true" tabindex="-1"></a>        img, <span class="ss">f"What grains and starches are in the image if any?"</span></span>
<span id="cb18-10"><a href="#cb18-10" aria-hidden="true" tabindex="-1"></a>    )</span>
<span id="cb18-11"><a href="#cb18-11" aria-hidden="true" tabindex="-1"></a>    answer <span class="op">+=</span> <span class="st">"</span><span class="ch">\n</span><span class="st">"</span> <span class="op">+</span> vqa.inference(img, <span class="ss">f"Is there plant-based milk in the image?"</span>)</span>
<span id="cb18-12"><a href="#cb18-12" aria-hidden="true" tabindex="-1"></a>    <span class="bu">print</span>(</span>
<span id="cb18-13"><a href="#cb18-13" aria-hidden="true" tabindex="-1"></a>        <span class="ss">f"""</span><span class="sc">{</span>desc<span class="sc">}</span></span>
<span id="cb18-14"><a href="#cb18-14" aria-hidden="true" tabindex="-1"></a><span class="sc">{</span>answer<span class="sc">}</span><span class="ss">"""</span></span>
<span id="cb18-15"><a href="#cb18-15" aria-hidden="true" tabindex="-1"></a>    )</span>
<span id="cb18-16"><a href="#cb18-16" aria-hidden="true" tabindex="-1"></a>    display(img.resize((<span class="bu">int</span>(img.size[<span class="dv">0</span>] <span class="op">*</span> <span class="fl">0.75</span>), <span class="bu">int</span>(img.size[<span class="dv">1</span>] <span class="op">*</span> <span class="fl">0.75</span>))))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout"> | |
<pre><code>CPU times: user 7.67 s, sys: 12.1 ms, total: 7.68 s
Wall time: 779 ms
a refrigerator with food inside
cabbage lettuce onion
apples
rice
yes
CPU times: user 10.5 s, sys: 8.13 ms, total: 10.5 s
Wall time: 1.06 s
a table with a variety of fruits and vegetables
broccoli and tomatoes
bananas apples oranges
potatoes
yes
CPU times: user 11.7 s, sys: 0 ns, total: 11.7 s
Wall time: 1.18 s
a refrigerator filled with food and drinks
broccoli and zucchini
bananas
rice
yes
CPU times: user 11.5 s, sys: 12.2 ms, total: 11.5 s
Wall time: 1.16 s
a counter with various foods on it
carrots and broccoli
apples bananas and tomatoes
rice
yes
CPU times: user 9.62 s, sys: 4.22 ms, total: 9.63 s
Wall time: 973 ms
a wooden table
potatoes and carrots
apples
potatoes
yes
CPU times: user 11.1 s, sys: 8.23 ms, total: 11.1 s
Wall time: 1.12 s
a table with a variety of fruits and vegetables
peppers broccoli and squash
watermelon limes and pineapple
rice
no</code></pre>
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-11-output-2.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-11-output-3.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-11-output-4.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-11-output-5.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-11-output-6.png" class="img-fluid"></p> | |
</div> | |
<div class="cell-output cell-output-display"> | |
<p><img src="03_ingredient_vision_files/figure-html/cell-11-output-7.png" class="img-fluid"></p> | |
</div> | |
</div> | |
<hr> | |
<p><a href="" target="_blank" style="float:right; font-size:smaller">source</a></p> | |
</section> | |
<section id="veganingredientfinder" class="level3"> | |
<h3 class="anchored" data-anchor-id="veganingredientfinder">VeganIngredientFinder</h3> | |
<blockquote class="blockquote"> | |
<pre><code> VeganIngredientFinder ()</code></pre> | |
</blockquote> | |
<hr> | |
<p><a href="" target="_blank" style="float:right; font-size:smaller">source</a></p> | |
</section> | |
<section id="veganingredientfinder.list_ingredients" class="level3"> | |
<h3 class="anchored" data-anchor-id="veganingredientfinder.list_ingredients">VeganIngredientFinder.list_ingredients</h3> | |
<blockquote class="blockquote"> | |
<pre><code> VeganIngredientFinder.list_ingredients (img:str)</code></pre> | |
</blockquote> | |
<table class="table"> | |
<thead> | |
<tr class="header"> | |
<th></th> | |
<th><strong>Type</strong></th> | |
<th><strong>Details</strong></th> | |
</tr> | |
</thead> | |
<tbody> | |
<tr class="odd"> | |
<td>img</td> | |
<td>str</td> | |
<td>Image file path</td> | |
</tr> | |
<tr class="even"> | |
<td><strong>Returns</strong></td> | |
<td><strong>str</strong></td> | |
<td></td> | |
</tr> | |
</tbody> | |
</table> | |
<div class="cell"> | |
<div class="sourceCode cell-code" id="cb22"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb22-1"><a href="#cb22-1" aria-hidden="true" tabindex="-1"></a>vegan_ingred_finder <span class="op">=</span> VeganIngredientFinder()</span> | |
<span id="cb22-2"><a href="#cb22-2" aria-hidden="true" tabindex="-1"></a>vegan_ingred_finder.list_ingredients(SAMPLE_IMG_DIR <span class="op">/</span> sample_images[<span class="dv">0</span>])</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | |
<div class="cell-output cell-output-display"> | |
<pre><code>'cabbage lettuce onion\napples\nrice\nplant-based milk'</code></pre> | |
</div> | |
</div> | |
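<p>The returned string has one answer per line, and the yes/no answer about plant-based milk appears as the ingredient name itself. How <code>list_ingredients</code> builds that string is not shown here. The sketch below is one way it could be done, together with a helper for splitting the result back into lines. Both function names are hypothetical.</p>

```python
def combine_answers(vegetables: str, fruits: str, grains: str, milk_answer: str) -> str:
    """Join VQA answers into one newline-separated ingredient listing."""
    parts = [vegetables, fruits, grains]
    # Turn the yes/no answer into an ingredient name; drop it when the answer is "no".
    if milk_answer.strip().lower().startswith("yes"):
        parts.append("plant-based milk")
    return "\n".join(part for part in parts if part)


def to_ingredient_lines(listing: str) -> list[str]:
    """Split a listing into its non-empty, stripped lines."""
    return [line.strip() for line in listing.splitlines() if line.strip()]


listing = combine_answers("cabbage lettuce onion", "apples", "rice", "yes")
print(to_ingredient_lines(listing))
```

<p>Splitting on lines rather than on spaces keeps multi-word items such as "plant-based milk" intact.</p>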
</section> | |
</main> <!-- /main --> | |
