Crashes or generates gibberish

#1
by Pumba2 - opened

GPT4All: crashes the whole app
KoboldCpp: generates gibberish

Working fine in latest llama.cpp. Please report the issues to the respective developers of those programs.

Also, try testing the non-k-quants, like Q4_0 or Q5_0, if you haven't already.

This model has an unusual size, and k-quants would not normally be supported for many of its layers, because their dimensions are not divisible by 256. Thanks to a recent llama.cpp change, those layers are now quantized with non-k-quant methods, like Q4_0, which is what enables me to make k-quants at all. I don't know whether that could cause issues for clients other than llama.cpp, but it's worth mentioning.
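To illustrate the divisibility rule described above, here is a minimal Python sketch of how a quantizer might decide, per tensor, whether a k-quant can be used or a non-k-quant fallback is needed. The 256 block size comes from the explanation above; the function name `pick_quant_type`, the fallback table, and the example row sizes are all hypothetical and are not taken from the llama.cpp source.

```python
QK_K = 256  # k-quant super-block size: row length must be a multiple of this

# Illustrative fallback table. This mapping is an assumption for the sketch,
# not copied from llama.cpp; the real choice may differ by version.
HYPOTHETICAL_FALLBACK = {
    "Q2_K": "Q4_0",
    "Q3_K": "Q4_0",
    "Q4_K": "Q5_0",
    "Q5_K": "Q5_1",
    "Q6_K": "Q8_0",
}


def pick_quant_type(row_size: int, requested: str) -> str:
    """Return the quant type to use for a tensor whose rows hold row_size values."""
    # Only k-quants carry the multiple-of-256 requirement.
    if requested in HYPOTHETICAL_FALLBACK and row_size % QK_K != 0:
        return HYPOTHETICAL_FALLBACK[requested]
    return requested


# Made-up row sizes: some are multiples of 256, one is not.
for row_size in (4096, 5504, 7168):
    print(row_size, pick_quant_type(row_size, "Q6_K"))
```

A file produced this way mixes k-quant and non-k-quant tensors, so a client that assumes every tensor in a "Q6_K" file has the same type could misread it. That is only one possible explanation for the gibberish and has not been confirmed here.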

llama.cpp tests:

Q6_K:
system_info: n_threads = 56 / 112 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
sampling:
repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
generate: n_ctx = 512, n_batch = 512, n_predict = 256, n_keep = 0

You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.

Instruction:

write a quick sort algorithm in python.

Response:

Sure! Here is the Python implementation of Quick Sort Algorithm using recursion:

def quicksort(arr):
    if len(arr) <= 1: # base case, return array directly when length less than or equal to one (already sorted).
        return arr
    else:              # recurse on the rest of the list.
                        # pivot is always first element in partitioning step for simplicity and efficiency reasons – usually randomized choice as well!
                      value_to_compare = arr[0]
                  lower = [x for x in arr[1:] if x < value_to_compare]
                 upper = [y for y in arr[1:] if y >=value_to_compare ]
                     return quicksort(lower) + [value_to_compare] +quicksort(upper )   # concatenate and sort the lists – this is our recursive call!

You can run above function using: print (quickSort([3,60,15]). You will get output as : '[27].' which means it sorts array correctly in asc
llama_print_timings: load time = 1280.79 ms
llama_print_timings: sample time = 112.96 ms / 256 runs ( 0.44 ms per token, 2266.25 tokens per second)
llama_print_timings: prompt eval time = 33.17 ms / 75 tokens ( 0.44 ms per token, 2260.81 tokens per second)
llama_print_timings: eval time = 1147.97 ms / 255 runs ( 4.50 ms per token, 222.13 tokens per second)
llama_print_timings: total time = 1359.18 ms
Log end

Q4_0:

system_info: n_threads = 56 / 112 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
sampling:
repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
generate: n_ctx = 512, n_batch = 512, n_predict = 256, n_keep = 0

You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.

Instruction:

write a quick sort algorithm in python.

Response:

Sure! Here's an implementation of Quick Sort Algorithm using Python programming language as requested by the problem statement. The quick_sort function takes two arguments - list and start index (default is 0) to avoid passing unnecessary parameters, which can save some time for large lists or when recursion depth exceeded error occurs due to too many iterations:

def quick_sort(arr):     # Input must be a sorted array. For non-sorted data set the function will not work correctly but sort it with "quick_sort" first before calling this method on your original list/array if you're sure about input being already in order and want to save some time
    less = []            # List of elements smaller than pivot.  Initialised as empty at start, so will always be sorted by recursive calls inside quicksort(). These are the 'lesser-than' or lesser values we have yet seen while partitioning around our chosen "pivotal" value in arr
    greater = []         # List of elements larger than pivot.  Initialised as empty at start, so will always be sorted by recursive calls inside quicksort(). These are the 'greater-than'
llama_print_timings:        load time =     658.90 ms
llama_print_timings:      sample time =      99.56 ms /   256 runs   (    0.39 ms per token,  2571.21 tokens per second)
llama_print_timings: prompt eval time =      27.82 ms /    75 tokens (    0.37 ms per token,  2695.90 tokens per second)
llama_print_timings:        eval time =     940.49 ms /   255 runs   (    3.69 ms per token,   271.14 tokens per second)
llama_print_timings:       total time =    1133.29 ms
Log end

OK, thanks for the fast reply. Have a good Sunday!
