
Custom Tools and Prompts[[custom-tools-and-prompts]]

If you are not yet familiar with tools and agents in the context of Transformers, we recommend reading the Transformers Agents page first.

Transformers Agents is an experimental API that is subject to change at any time. Results returned by the agents can vary, because the APIs and the underlying models are prone to change.

Creating and using custom tools and prompts is paramount to empowering the agent and having it perform new tasks. This guide covers:

  • How to customize the prompt
  • How to use custom tools
  • How to create custom tools

Customizing the prompt[[customizing-the-prompt]]

As explained in Transformers Agents, agents can run in Agent.run and Agent.chat mode. Both the run and chat modes rely on the same logic. The language model powering the agent is conditioned on a long prompt and completes it by generating the next tokens until the stop token is reached. The only difference between the two modes is that in chat mode the prompt is extended with previous user inputs and model generations. This gives the agent access to past interactions, in effect giving it a kind of memory.

Structure of the prompt[[structure-of-the-prompt]]

Let's take a closer look at how the prompt is structured to understand how best to customize it. The prompt is broadly structured into four parts.

    1. Introduction: how the agent should behave, and an explanation of the concept of tools.
    2. Description of all the tools. This is defined by an <<all_tools>> token that is dynamically replaced at runtime with the tools defined or chosen by the user.
    3. A set of example tasks and their solutions.
    4. The current example, and a request for its solution.

To better understand each part, let's look at a shortened version of what the run prompt can look like:

I will ask you to perform a task, your job is to come up with a series of simple commands in Python that will perform the task.
[...]
You can print intermediate results if it makes sense to do so.

Tools:
- document_qa: This is a tool that answers a question about a document (pdf). It takes an input named `document` which should be the document containing the information, as well as a `question` that is the question about the document. It returns a text that contains the answer to the question.
- image_captioner: This is a tool that generates a description of an image. It takes an input named `image` which should be the image to the caption and returns a text that contains the description in English.
[...]

Task: "Answer the question in the variable `question` about the image stored in the variable `image`. The question is in French."

I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.

Answer:
```py
translated_question = translator(question=question, src_lang="French", tgt_lang="English")
print(f"The translated question is {translated_question}.")
answer = image_qa(image=image, question=translated_question)
print(f"The answer is {answer}")
```

Task: "Identify the oldest person in the `document` and create an image showcasing the result as a banner."

I will use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.

Answer:
```py
answer = document_qa(document, question="What is the oldest person?")
print(f"The answer is {answer}.")
image = image_generator("A banner showing " + answer)
```

[...]

Task: "Draw me a picture of rivers and lakes"

I will use the following

The introduction (the text before "Tools:") explains precisely how the model should behave and what it should do. This part most likely does not need to be customized, as the agent should always behave the same way.

The second part (the bullet points below "Tools") is added dynamically when run or chat is called. There are exactly as many bullet points as there are tools in agent.toolbox, and each bullet point consists of the name and description of the tool:

- <tool.name>: <tool.description>

Let's verify this quickly by loading the document_qa tool and printing its name and description.

from transformers import load_tool

document_qa = load_tool("document-question-answering")
print(f"- {document_qa.name}: {document_qa.description}")

This prints:

- document_qa: This is a tool that answers a question about a document (pdf). It takes an input named `document` which should be the document containing the information, as well as a `question` that is the question about the document. It returns a text that contains the answer to the question.

We can see that the tool name is short and precise. The description has two parts: the first explains what the tool does, and the second states which input arguments and return values are expected.

A good tool name and tool description are very important for the agent to use the tool correctly. The only information the agent has about a tool is its name and description, so make sure that both are precisely written and match the style of the existing tools in the toolbox. In particular, make sure the description mentions every expected argument by name in code style, along with its expected type and a description of what it is.
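The last point can be checked mechanically by comparing a tool's call signature with its description. The sketch below is a hypothetical helper, not part of Transformers: it only assumes that a tool exposes a description string and a __call__ method, and it uses stand-in classes (FakeDocumentQA, FakeBadTool) so that it runs without loading any model.

```python
import inspect
import re


def undocumented_arguments(tool) -> list:
    """Return call arguments that the description never mentions in backticks."""
    signature = inspect.signature(tool.__call__)
    # Names wrapped in backticks inside the description, e.g. `document`.
    documented = set(re.findall(r"`([^`]+)`", tool.description))
    return [name for name in signature.parameters if name not in documented]


class FakeDocumentQA:
    """Stand-in for a real tool; only the attributes used by the check exist."""

    name = "document_qa"
    description = (
        "This is a tool that answers a question about a document (pdf). "
        "It takes an input named `document` which should be the document "
        "containing the information, as well as a `question` that is the "
        "question about the document. It returns a text that contains the "
        "answer to the question."
    )

    def __call__(self, document, question):
        raise NotImplementedError("illustration only")


class FakeBadTool(FakeDocumentQA):
    """Same description, but the call takes an argument it never mentions."""

    def __call__(self, document, question, language):
        raise NotImplementedError("illustration only")


print(undocumented_arguments(FakeDocumentQA()))  # []
print(undocumented_arguments(FakeBadTool()))     # ['language']
```

A check like this only catches missing argument names; whether the description is clear enough for the agent still has to be judged by trying it.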

To better understand what name and description a tool is expected to have, check the names and descriptions of the curated Transformers tools. You can see all tools with the Agent.toolbox property.

The third part contains a set of curated examples that show the agent exactly what code it should produce for what kind of user request. The large language models powering the agent are extremely good at recognizing patterns in a prompt and repeating the pattern with new data. It is therefore very important that the examples are written in a way that maximizes the likelihood of the agent generating correct, executable code in practice.

Let's have a look at one example:

Task: "Identify the oldest person in the `document` and create an image showcasing the result as a banner."

I will use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.

Answer:
```py
answer = document_qa(document, question="What is the oldest person?")
print(f"The answer is {answer}.")
image = image_generator("A banner showing " + answer)
```

The pattern the model is prompted to repeat has three parts: the task statement, the agent's explanation of what it intends to do, and finally the generated code. Every example in the prompt follows this exact pattern, which makes the agent likely to reproduce the same pattern when generating new tokens.

The prompt examples are curated by the Transformers team and rigorously evaluated on a set of problem statements, so that the agent's prompt is as good as possible at solving real use cases of the agent.

The final part of the prompt corresponds to:

Task: "Draw me a picture of rivers and lakes"

I will use the following

This is a final, unfinished example that the agent is tasked to complete. The unfinished example is created dynamically from the actual user input. For the example above, the user ran:

agent.run("Draw me a picture of rivers and lakes")

The user input, that is the task "Draw me a picture of rivers and lakes", is cast into the prompt template "Task: <task> \n\n I will use the following". This sentence makes up the final lines of the prompt the agent is conditioned on, and therefore strongly influences the agent to finish the example exactly as was done in the previous examples.
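The assembly of the run prompt described so far can be sketched in a few lines. This is a simplified illustration, not the library's implementation: the toolbox is assumed to be a plain dictionary mapping tool names to descriptions (in the real agent the values are tool objects), RUN_TEMPLATE is a heavily shortened stand-in for the default template, and build_run_prompt is a hypothetical helper name.

```python
# Shortened stand-in for a run prompt template with its two placeholders.
RUN_TEMPLATE = """I will ask you to perform a task, your job is to come up with a series of simple commands in Python that will perform the task.

Tools:
<<all_tools>>

Task: "<<prompt>>"

I will use the following"""


def build_run_prompt(template: str, toolbox: dict, task: str) -> str:
    """Fill a run-style template with the tool bullet list and the user's task."""
    # One bullet per tool, in the "- <tool.name>: <tool.description>" format.
    tool_lines = "\n".join(
        f"- {name}: {description}" for name, description in toolbox.items()
    )
    return template.replace("<<all_tools>>", tool_lines).replace("<<prompt>>", task)


toolbox = {
    "image_generator": "This is a tool that creates an image according to a prompt.",
    "image_captioner": "This is a tool that generates a description of an image.",
}
prompt = build_run_prompt(RUN_TEMPLATE, toolbox, "Draw me a picture of rivers and lakes")
print(prompt)
```

The printed prompt ends with the unfinished example, so the most natural continuation for the model is the list of tools it intends to use, followed by code.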

Without going into too much detail, the chat template has the same prompt structure, with the examples having a slightly different style. For example:

[...]

=====

Human: Answer the question in the variable `question` about the image stored in the variable `image`.

Assistant: I will use the tool `image_qa` to answer the question on the input image.

```py
answer = image_qa(text=question, image=image)
print(f"The answer is {answer}")
```

Human: I tried this code, it worked but didn't give me a good result. The question is in French

Assistant: In this case, the question needs to be translated first. I will use the tool `translator` to do this.

```py
translated_question = translator(question=question, src_lang="French", tgt_lang="English")
print(f"The translated question is {translated_question}.")
answer = image_qa(text=translated_question, image=image)
print(f"The answer is {answer}")
```

=====

[...]

In contrast to the examples of the run prompt, each chat prompt example has one or more exchanges between the Human and the Assistant. Every exchange is structured similarly to an example of the run prompt. The user's input is appended after "Human:", and the agent is prompted to first state what needs to be done before generating code. An exchange can build on previous exchanges, which lets the user refer to past exchanges: above, the user's input "I tried this code" refers to the code previously generated by the agent.

Upon running .chat, the user's input or task is cast into an unfinished example of the form:

Human: <user-input>\n\nAssistant:

The agent then completes it. Unlike the run command, the chat command then appends the completed example to the prompt, giving the agent more context for the next chat turn.
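The way a chat prompt grows from turn to turn can be sketched as follows. This is a toy model under stated assumptions, not the library's code: ChatPrompt and its methods are hypothetical names, the base prompt is a placeholder string, and the assistant's completion is made up for illustration.

```python
class ChatPrompt:
    """Toy model of a chat prompt that grows by one exchange per turn."""

    def __init__(self, base_prompt: str):
        self.text = base_prompt

    def open_turn(self, user_input: str) -> str:
        """Append the unfinished example and return the prompt the model would see."""
        self.text += f"Human: {user_input}\n\nAssistant:"
        return self.text

    def close_turn(self, completion: str) -> None:
        """Store the model's completion so that the next turn can refer to it."""
        self.text += f" {completion}\n\n"


chat = ChatPrompt("[...introduction, tools and examples...]\n\n")

# First turn: the prompt ends with an unfinished exchange that the model completes.
chat.open_turn("Generate an image of a river")
chat.close_turn("I will use the tool `image_generator` to generate the image.")

# Second turn: the previous exchange is now part of the prompt.
second_prompt = chat.open_turn("Now add a boat to it")
print(second_prompt)
```

Because the first exchange stays in the prompt, the second request can say "it" and still be understood, which is the memory-like behavior of chat mode.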

Now that we know how the prompt is structured, let's see how we can customize it!

Writing good user inputs[[writing-good-user-inputs]]

While large language models are getting better and better at understanding users' intentions, it helps enormously to be as precise as possible so that the agent picks the correct task. What does it mean to be as precise as possible?

The agent sees a list of tool names and their descriptions in its prompt. The more tools are added, the harder it becomes for the agent to choose the correct tool, and it is even harder to choose the correct sequence of tools to run. Let's look at a common failure case. Here we only return the code, in order to analyze it.

from transformers import HfAgent

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")

agent.run("Show me a tree", return_code=True)

This prints:

==Explanation from the agent==
I will use the following tool: `image_segmenter` to create a segmentation mask for the image.


==Code generated by the agent==
mask = image_segmenter(image, prompt="tree")

This is probably not what we wanted. It is more likely that we want an image of a tree to be generated. To steer the agent toward a specific tool, it can therefore be very helpful to use important keywords that appear in the tool's name and description. Let's have a look.

agent.toolbox["image_generator"].description
'This is a tool that creates an image according to a prompt, which is a text description. It takes an input named `prompt` which contains the image description and outputs an image.'

The name and description use the keywords "image", "prompt", "create" and "generate". Using these words will most likely work better here. Let's refine our prompt a bit.

agent.run("Create an image of a tree", return_code=True)

This returns:

==Explanation from the agent==
I will use the following tool `image_generator` to generate an image of a tree.


==Code generated by the agent==
image = image_generator(prompt="tree")

Much better! That looks more like what we want. In short, when you notice that the agent struggles to map your task to the correct tools, look up the most pertinent keywords in the tool's name and description and use them to refine your task request.

Customizing the tool descriptions[[customizing-the-tool-descriptions]]

As we have seen, the agent has access to the name and description of each tool. The base tools should have very precise names and descriptions, but you may find that changing the description or name of a tool helps for your specific use case. This can become especially important when you have added multiple tools that are very similar, or when you want to use your agent only for a certain domain, such as image generation and transformation.

A common problem is that the agent confuses image generation with image transformation or modification when it is used heavily for image generation tasks. For example:

agent.run("Make an image of a house and a car", return_code=True)

This returns:

==Explanation from the agent== 
I will use the following tools `image_generator` to generate an image of a house and `image_transformer` to transform the image of a car into the image of a house.

==Code generated by the agent==
house_image = image_generator(prompt="A house")
car_image = image_generator(prompt="A car")
house_car_image = image_transformer(image=car_image, prompt="A house")

This is probably not exactly what we want here. It seems the agent has a hard time understanding the difference between image_generator and image_transformer, and often uses the two together.

We can help the agent here by changing the tool name and description of image_transformer. Let's call it modifier instead, to dissociate it a bit from "image" and "prompt":

agent.toolbox["modifier"] = agent.toolbox.pop("image_transformer")
agent.toolbox["modifier"].description = agent.toolbox["modifier"].description.replace(
    "transforms an image according to a prompt", "modifies an image"
)

Now "modify" is a strong cue to use the new image processor, which should help with the prompt above. Let's run it again.

agent.run("Make an image of a house and a car", return_code=True)

Now we get:

==Explanation from the agent==
I will use the following tools: `image_generator` to generate an image of a house, then `image_generator` to generate an image of a car.


==Code generated by the agent==
house_image = image_generator(prompt="A house")
car_image = image_generator(prompt="A car")

This is definitely closer to what we had in mind! However, we want both the house and the car in the same image. Steering the task more toward single-image generation should help:

agent.run("Create image: 'A house and car'", return_code=True)
==Explanation from the agent==
I will use the following tool: `image_generator` to generate an image.


==Code generated by the agent==
image = image_generator(prompt="A house and car")

Agents are still brittle for many use cases, especially slightly more complex ones such as generating an image of multiple objects. Both the agent itself and the underlying prompt will be further improved in the coming months, so that agents become more robust to a variety of user inputs.

Customizing the whole prompt[[customizing-the-whole-prompt]]

To give the user maximum flexibility, the whole prompt template described above can be overwritten by the user. In this case, make sure that your custom prompt includes an introduction section, a tool section, an example section, and an unfinished example section. To overwrite the run prompt template, you can do the following:

template = """ [...] """

agent = HfAgent(your_endpoint, run_prompt_template=template)

Make sure the <<all_tools>> string and the <<prompt>> string are defined somewhere in the template, so that the agent is aware of the tools available to it and can correctly insert the user's prompt.
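Before passing a custom run template to the agent, it can be worth checking that both placeholders are present. The helper below is a hypothetical convenience, not a Transformers function, and the two templates are made-up minimal examples.

```python
REQUIRED_RUN_PLACEHOLDERS = ("<<all_tools>>", "<<prompt>>")


def missing_placeholders(template: str, required=REQUIRED_RUN_PLACEHOLDERS) -> list:
    """Return the required placeholders that do not occur in the template."""
    return [placeholder for placeholder in required if placeholder not in template]


good_template = """You write Python code that solves the task using the tools below.

Tools:
<<all_tools>>

Task: "<<prompt>>"

I will use the following"""

# This template hard-codes a task, so the user's input can never be inserted.
bad_template = """Tools:
<<all_tools>>

Task: draw something"""

print(missing_placeholders(good_template))  # []
print(missing_placeholders(bad_template))   # ['<<prompt>>']
```

An empty list means the template has the placeholders it needs; it says nothing about the quality of the introduction or the examples.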

Similarly, you can overwrite the chat prompt template. Note that chat mode always uses the following format for the exchanges:

Human: <<task>>

Assistant:

It is therefore important that the examples in a custom chat prompt template also use this format. You can overwrite the chat template at instantiation as follows.

template = """ [...] """

agent = HfAgent(url_endpoint=your_endpoint, chat_prompt_template=template)

Make sure the <<all_tools>> string is defined somewhere in the template, so that the agent is aware of the tools available to it.

In both cases, you can pass a repository ID instead of the prompt template if you would like to use a template hosted by someone in the community. The default prompts live in this repository, which serves as an example.

To upload your custom prompt to a repository on the Hub and share it with the community, make sure to:

  • use a dataset repository.
  • put the prompt template for the run command in a file named run_prompt_template.txt.
  • put the prompt template for the chat command in a file named chat_prompt_template.txt.

Using custom tools[[using-custom-tools]]

In this section, we'll leverage two existing custom tools that are specific to image generation:

  • We replace huggingface-tools/image-transformation with diffusers/controlnet-canny-tool to allow for more image modifications.
  • We add a new tool for image upscaling to the default toolbox: diffusers/latent-upscaler-tool.

We'll start by loading the custom tools with the convenient load_tool function:

from transformers import load_tool

controlnet_transformer = load_tool("diffusers/controlnet-canny-tool")
upscaler = load_tool("diffusers/latent-upscaler-tool")

When custom tools are added to an agent, their descriptions and names are automatically included in the agent's prompt. It is therefore imperative that custom tools have a well-written description and name, so that the agent understands how to use them. Let's take a look at the description and name of controlnet_transformer:

print(f"Description: '{controlnet_transformer.description}'")
print(f"Name: '{controlnet_transformer.name}'")

This prints:

Description: 'This is a tool that transforms an image with ControlNet according to a prompt. 
It takes two inputs: `image`, which should be the image to transform, and `prompt`, which should be the prompt to use to change it. It returns the modified image.'
Name: 'image_transformer'

The name and description are accurate and fit the style of the curated set of tools. Next, let's instantiate an agent with controlnet_transformer and upscaler:

tools = [controlnet_transformer, upscaler]
agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder", additional_tools=tools)

This command should give you the following information:

image_transformer has been replaced by <transformers_modules.diffusers.controlnet-canny-tool.bd76182c7777eba9612fc03c0
8718a60c0aa6312.image_transformation.ControlNetTransformationTool object at 0x7f1d3bfa3a00> as provided in `additional_tools`

The set of curated tools already has an image_transformer tool, which is hereby replaced by our custom tool.

Overwriting an existing tool can be beneficial if you want to use a custom tool for exactly the same task as the existing tool, because the agent is already well-versed in that task. Beware that in this case the custom tool should follow exactly the same API as the overwritten tool; otherwise, you should adapt the prompt template so that all examples using that tool are updated.
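One rough way to check the "same API" requirement is to compare the parameter names of the two callables. The sketch below uses plain functions as stand-ins for tools; same_call_api is a hypothetical helper, and the (image, prompt) arguments are taken from the way image_transformer is called in the generated code earlier in this guide.

```python
import inspect


def same_call_api(original, replacement) -> bool:
    """Check that two callables accept the same parameter names in the same order."""
    original_params = list(inspect.signature(original).parameters)
    replacement_params = list(inspect.signature(replacement).parameters)
    return original_params == replacement_params


def default_image_transformer(image, prompt):
    """Stand-in for the tool being overwritten."""


def compatible_transformer(image, prompt):
    """Stand-in for a custom tool with the same arguments."""


def incompatible_transformer(picture, text, strength=0.5):
    """Stand-in for a custom tool whose arguments differ."""


print(same_call_api(default_image_transformer, compatible_transformer))    # True
print(same_call_api(default_image_transformer, incompatible_transformer))  # False
```

If the check fails, the prompt examples that call the tool with image= and prompt= would produce code that no longer runs, which is why the template would need updating.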

The upscaler tool was given the name image_upscaler, which is not yet present in the default toolbox, so it is simply added to the list of tools. You can always inspect the toolbox currently available to the agent through the agent.toolbox attribute:

print("\n".join([f"- {a}" for a in agent.toolbox.keys()]))
- document_qa
- image_captioner
- image_qa
- image_segmenter
- transcriber
- summarizer
- text_classifier
- text_qa
- text_reader
- translator
- image_transformer
- text_downloader
- image_generator
- video_generator
- image_upscaler

Note how image_upscaler is now part of the agent's toolbox.
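The replace-or-append behavior seen above follows from the toolbox being keyed by tool name. The sketch below illustrates that idea with a plain dictionary; it is not the library's implementation, and merge_tools and NamedTool are hypothetical names used only here.

```python
def merge_tools(default_toolbox: dict, additional_tools: list) -> tuple:
    """Return the merged toolbox and the names of default tools that were replaced."""
    merged = dict(default_toolbox)
    replaced = []
    for tool in additional_tools:
        if tool.name in merged:
            replaced.append(tool.name)
        # An existing key keeps its position and gets the new tool;
        # a new key is appended at the end.
        merged[tool.name] = tool
    return merged, replaced


class NamedTool:
    """Minimal stand-in that only carries a name."""

    def __init__(self, name: str):
        self.name = name


defaults = {
    name: NamedTool(name)
    for name in ("image_transformer", "text_downloader", "image_generator")
}
custom = [NamedTool("image_transformer"), NamedTool("image_upscaler")]

toolbox, replaced = merge_tools(defaults, custom)
print(list(toolbox))  # ['image_transformer', 'text_downloader', 'image_generator', 'image_upscaler']
print(replaced)       # ['image_transformer']
```

The name attribute of a custom tool therefore decides whether it overrides a curated tool or extends the toolbox.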

Let's now try out the new tools! We will reuse the image we generated in the Transformers Agents Quickstart.

from diffusers.utils import load_image

image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rivers_and_lakes.png"
)

Let's transform the image into a beautiful winter landscape:

image = agent.run("Transform the image: 'A frozen lake and snowy forest'", image=image)
==Explanation from the agent==
I will use the following tool: `image_transformer` to transform the image.


==Code generated by the agent==
image = image_transformer(image, prompt="A frozen lake and snowy forest")

The new image processing tool is based on ControlNet, which can make very strong modifications to the image. By default, the image processing tool returns an image of size 512x512 pixels. Let's see if we can upscale it.

image = agent.run("Upscale the image", image=image)
==Explanation from the agent==
I will use the following tool: `image_upscaler` to upscale the image.


==Code generated by the agent==
upscaled_image = image_upscaler(image)

The agent automatically mapped our prompt "Upscale the image" to the upscaler tool we just added, purely based on the tool's description and name, and was able to run it correctly.

Next, let's have a look at how you can create a new custom tool.

Adding new tools[[adding-new-tools]]

In this section, we show how to create a new tool that can be added to the agent.

Creating a new tool[[creating-a-new-tool]]

We'll start by creating our tool. We'll add the not-so-useful yet fun task of fetching the model on the Hugging Face Hub with the most downloads for a given task.

We can do that with the following code:

from huggingface_hub import list_models

task = "text-classification"

model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
print(model.id)

At the time of writing, this returns 'facebook/bart-large-mnli' for the text-classification task and 't5-base' for the translation task.

How do we convert this into a tool that the agent can leverage? All tools depend on the superclass Tool, which holds the main attributes necessary. We'll create a class that inherits from it:

from transformers import Tool


class HFModelDownloadsTool(Tool):
    pass

This class has a few requirements:

  • A name attribute, which corresponds to the name of the tool itself. To be in tune with other tools, which have performative names, we'll name it model_download_counter.
  • A description attribute, which is used to populate the agent's prompt.
  • inputs and outputs attributes. Defining these helps the Python interpreter make educated choices about types, and allows a gradio demo to be spawned when we push our tool to the Hub. Both are lists of expected values, each of which can be text, image, or audio.
  • A __call__ method that contains the inference code. This is the code we played with above!

Here is what our class looks like now:

from transformers import Tool
from huggingface_hub import list_models


class HFModelDownloadsTool(Tool):
    name = "model_download_counter"
    description = (
        "This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub. "
        "It takes the name of the category (such as text-classification, depth-estimation, etc), and "
        "returns the name of the checkpoint."
    )

    inputs = ["text"]
    outputs = ["text"]

    def __call__(self, task: str):
        model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
        return model.id

We now have our tool handy. Save it in a file and import it from your main script. Let's name this file model_downloads.py, so the resulting import code looks like this:

from model_downloads import HFModelDownloadsTool

tool = HFModelDownloadsTool()

To let others benefit from it, and for simpler initialization, we recommend pushing it to the Hub under your namespace. To do so, call push_to_hub on the tool variable:

tool.push_to_hub("hf-model-downloads")

You now have your code on the Hub! Let's take a look at the final step, which is to have the agent use it.

Having the agent use the tool[[Having-the-agent-use-the-tool]]

We now have a tool that lives on the Hub and can be instantiated as follows (change the user name to the one for your tool):

from transformers import load_tool

tool = load_tool("lysandre/hf-model-downloads")

To use it in the agent, simply pass it in the additional_tools parameter of the agent initialization method:

from transformers import HfAgent

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder", additional_tools=[tool])

agent.run(
    "Can you read out loud the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?"
)

This outputs the following:

==Code generated by the agent==
model = model_download_counter(task="text-to-video")
print(f"The model with the most downloads is {model}.")
audio_model = text_reader(model)


==Result==
The model with the most downloads is damo-vilab/text-to-video-ms-1.7b.

The run also generates an audio clip that reads the model name out loud.

Depending on the LLM, some are quite brittle and require very exact prompts in order to work well. Having a well-defined name and description for the tool is paramount to having the agent leverage it.

Replacing existing tools[[replacing-existing-tools]]

Existing tools can be replaced simply by assigning a new item to the agent's toolbox. Here is how to do so:

from transformers import HfAgent, load_tool

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")
# The toolbox is keyed by tool name, so reuse the existing key to replace that tool.
agent.toolbox["image_transformer"] = load_tool("diffusers/controlnet-canny-tool")

Beware when replacing tools with others! This also adjusts the agent's prompt. That can be good if you have a prompt better suited to the task, but it can also result in your tool being selected far more often than others, or in other tools being selected instead of the one you defined.

Leveraging gradio-tools[[leveraging-gradio-tools]]

gradio-tools is a powerful library that allows using Hugging Face Spaces as tools. It supports many existing Spaces, as well as custom Spaces designed with it.

We offer support for gradio_tools through the Tool.from_gradio method. For example, we want to take advantage of the StableDiffusionPromptGeneratorTool tool offered in the gradio-tools toolkit to improve our prompts and generate better images.

We first import the tool from gradio_tools and instantiate it:

from gradio_tools import StableDiffusionPromptGeneratorTool

gradio_tool = StableDiffusionPromptGeneratorTool()

We pass that instance to the Tool.from_gradio method:

from transformers import Tool

tool = Tool.from_gradio(gradio_tool)

Now we can manage it exactly as we would a regular custom tool. We leverage it to improve our prompt 'a rabbit wearing a space suit':

from transformers import HfAgent

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder", additional_tools=[tool])

agent.run("Generate an image of the `prompt` after improving it.", prompt="A rabbit wearing a space suit")

The model adequately leverages the tool:

==Explanation from the agent==
I will use the following  tools: `StableDiffusionPromptGenerator` to improve the prompt, then `image_generator` to generate an image according to the improved prompt.


==Code generated by the agent==
improved_prompt = StableDiffusionPromptGenerator(prompt)
print(f"The improved prompt is {improved_prompt}.")
image = image_generator(improved_prompt)

The agent then generates the image from the improved prompt.

gradio-tools requires textual inputs and outputs, even when working with different modalities. This implementation works with image and audio objects. The two are currently incompatible, but will rapidly become compatible as we work to improve the support.

Future compatibility with Langchain[[future-compatibility-with-langchain]]

We love Langchain and think it has a very compelling suite of tools. To handle these tools, Langchain requires textual inputs and outputs, even when working with different modalities. This is often the serialized version (that is, saved to disk) of the objects.

Because of this difference, multi-modality is not handled between transformers-agents and Langchain. We aim to resolve this limitation in future versions, and welcome help from avid Langchain users in achieving this compatibility.
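To make the difference concrete, the sketch below wraps a tool that works on in-memory objects so that it takes and returns file paths, which are plain text. This is purely illustrative and is not how either library bridges the gap: the "image" is a toy grid of pixel values stored as JSON, and invert_pixels and as_text_tool are hypothetical names.

```python
import json
import tempfile
from pathlib import Path


def invert_pixels(pixels: list) -> list:
    """Object-based tool: takes and returns an in-memory grid of 0-255 values."""
    return [[255 - value for value in row] for row in pixels]


def as_text_tool(object_tool, workdir: Path):
    """Wrap an object-based tool so that it takes and returns file paths (text)."""

    def text_tool(input_path: str) -> str:
        pixels = json.loads(Path(input_path).read_text())  # deserialize the input
        result = object_tool(pixels)                        # run on the in-memory object
        output_path = workdir / f"out_{Path(input_path).name}"
        output_path.write_text(json.dumps(result))          # serialize the output
        return str(output_path)

    return text_tool


workdir = Path(tempfile.mkdtemp())
source = workdir / "image.json"
source.write_text(json.dumps([[0, 128], [255, 64]]))

text_invert = as_text_tool(invert_pixels, workdir)
result_path = text_invert(str(source))
print(result_path)
print(json.loads(Path(result_path).read_text()))  # [[255, 127], [0, 191]]
```

The wrapped tool never exchanges anything but strings with its caller, at the cost of a round trip through disk for every call.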

We would love to offer better support. If you would like to help, please open an issue and share what you have in mind.