TheBloke/deepseek-coder-1.3b-instruct-GGUF · Crashes or generates gibberish

Pumba2

Nov 5, 2023

GPT4All: crashes the whole app
KoboldCpp: generates gibberish

TheBloke

Owner Nov 5, 2023

Working fine in latest llama.cpp. Please report the issues to the respective developers of those programs.

Also, try testing the non-k-quants, like Q4_0 or Q5_0, if you haven't already.

This is an unusually sized model, and k-quants would not normally be supported for many of its layers, because their dimensions are not divisible by 256. Following a recent llama.cpp change, those layers are now quantised using non-k-quant methods, like Q4_0, and that is what enables me to make k-quants at all. I don't know whether that could cause issues for clients other than llama.cpp, but it's worth mentioning.
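
To make the divisible-by-256 rule concrete, here is a minimal Python sketch of the fallback idea. It is not llama.cpp's actual code: the function name `choose_quant_type`, the single `fallback` parameter, and the row lengths in the loop are all hypothetical, and the real per-type fallback choices in llama.cpp may differ from the Q4_0 used here as an example.

```python
# Illustrative sketch only; not taken from llama.cpp.
QK_K = 256  # k-quant block size: the "256" that row lengths must divide by


def choose_quant_type(n_cols: int, requested: str, fallback: str = "Q4_0") -> str:
    """Pick a quant type for one tensor, given its row length.

    Assumption: k-quant type names end in "_K" (e.g. "Q6_K"), and a single
    hypothetical fallback type is used whenever the row length is not a
    multiple of QK_K.
    """
    is_k_quant = requested.endswith("_K")
    if is_k_quant and n_cols % QK_K != 0:
        # Row length does not split into whole 256-element blocks,
        # so use a non-k-quant method for this tensor instead.
        return fallback
    return requested


# Hypothetical row lengths, chosen only to show both outcomes.
for n_cols in (4096, 2048, 5504):
    print(n_cols, choose_quant_type(n_cols, "Q6_K"))
```

With these example values, 4096 and 2048 keep Q6_K, while 5504 (which leaves a remainder of 128 when divided by 256) falls back to Q4_0. A client that assumes every tensor in a k-quant file really is a k-quant could plausibly trip over such mixed files, though that is only a guess.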

llama.cpp tests:

Q6_K:
system_info: n_threads = 56 / 112 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
sampling:
repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
generate: n_ctx = 512, n_batch = 512, n_predict = 256, n_keep = 0

You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.

Instruction:

write a quick sort algorithm in python.

Response:

Sure! Here is the Python implementation of Quick Sort Algorithm using recursion:

def quicksort(arr):
    if len(arr) <= 1: # base case, return array directly when length less than or equal to one (already sorted).
        return arr
    else:              # recurse on the rest of the list.
                        # pivot is always first element in partitioning step for simplicity and efficiency reasons – usually randomized choice as well!
                      value_to_compare = arr[0]
                  lower = [x for x in arr[1:] if x < value_to_compare]
                 upper = [y for y in arr[1:] if y >=value_to_compare ]
                     return quicksort(lower) + [value_to_compare] +quicksort(upper )   # concatenate and sort the lists – this is our recursive call!

You can run above function using: print (quickSort([3,60,15]). You will get output as : '[27].' which means it sorts array correctly in asc
llama_print_timings: load time = 1280.79 ms
llama_print_timings: sample time = 112.96 ms / 256 runs ( 0.44 ms per token, 2266.25 tokens per second)
llama_print_timings: prompt eval time = 33.17 ms / 75 tokens ( 0.44 ms per token, 2260.81 tokens per second)
llama_print_timings: eval time = 1147.97 ms / 255 runs ( 4.50 ms per token, 222.13 tokens per second)
llama_print_timings: total time = 1359.18 ms
Log end
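
For anyone who wants to sanity-check the Q6_K output above, this is the same quicksort with only the indentation normalised; the names and logic are the model's, the layout and comments are mine. It shows the generated code is coherent rather than gibberish, even though the pasted log lost its indentation.

```python
def quicksort(arr):
    # Base case: a list of length 0 or 1 is already sorted.
    if len(arr) <= 1:
        return arr
    # The first element is used as the pivot, as in the generated code.
    value_to_compare = arr[0]
    lower = [x for x in arr[1:] if x < value_to_compare]
    upper = [y for y in arr[1:] if y >= value_to_compare]
    # Sort each partition recursively and join them around the pivot.
    return quicksort(lower) + [value_to_compare] + quicksort(upper)


print(quicksort([3, 60, 15]))  # prints [3, 15, 60]
```

Note that the model's closing sentence calls the function `quickSort` and claims the result is '[27]'; both are wrong, and the generation was cut off at n_predict = 256 before it finished.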

Q4_0:

system_info: n_threads = 56 / 112 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
sampling:
repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
generate: n_ctx = 512, n_batch = 512, n_predict = 256, n_keep = 0