
How do I increase or decrease the size of the response in an API call?

#263
by Iluvmelons - opened
// `dream` holds the prompt text; `{key}` is a placeholder for the API token.
async function query(dream) {
    const requestData = {
        inputs: dream
    };

    const response = await fetch(
        "https://api-inference.huggingface.co/models/bigscience/bloom",
        {
            headers: {
                Authorization: "Bearer {key}",
                "Content-Type": "application/json"
            },
            method: "POST",
            body: JSON.stringify(requestData)
        }
    );

    if (response.status === 200) {
        const result = await response.json();
        console.log(result);
        return result;
    } else {
        console.error("Error:", response.status, response.statusText);
    }
}

Above is the format I am using.

Hey @Iluvmelons, do you mean the number of tokens generated by the model?

It's using the Text Generation Inference API; you can read the documentation here: https://huggingface.co/docs/api-inference/detailed_parameters#text-generation-task

All parameters are available there; the one you're most interested in is likely max_new_tokens or max_time.
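Per the detailed-parameters docs linked above, generation options go in a `parameters` object sent alongside `inputs` in the request body. A minimal sketch of how that payload could be built (the value 250 and the helper name `buildRequest` are illustrative, not from the docs):

```javascript
// Build an Inference API payload with generation parameters.
// `max_new_tokens` caps how many tokens the model generates;
// `max_time` (in seconds) would instead cap generation wall-clock time.
function buildRequest(prompt, maxNewTokens) {
    return {
        inputs: prompt,
        parameters: {
            max_new_tokens: maxNewTokens
        }
    };
}

// Usage: same fetch call as in the question, swapping the body in:
// body: JSON.stringify(buildRequest(dream, 250))
```

The rest of the fetch call stays unchanged; only the JSON body gains the extra `parameters` object.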
