poetry run python -m private_gpt
10:51:41.232 [INFO ] private_gpt.settings.settings_loader - Starting application with profiles=['default', 'local']
10:51:46.763 [INFO ] private_gpt.components.llm.llm_component - Tokenizer successfully initialized with pre-tokenizer settings.
10:51:48.248 [INFO ] private_gpt.components.llm.llm_component - Attempting to initialize tokenizer from C:\SharedTools\YODA\private-gpt\models\Tokenizer
10:51:48.451 [INFO ] private_gpt.components.llm.llm_component - Initializing the LLM in mode=llamacpp
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
Device 1: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
llama_load_model_from_file: using device CUDA0 (NVIDIA GeForce RTX 4090) - 22994 MiB free
llama_load_model_from_file: using device CUDA1 (NVIDIA GeForce RTX 4090) - 22994 MiB free
llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from C:\SharedTools\YODA\private-gpt\models\Llama-3-Smaug-8B-Q8_0.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = llama
llama_model_loader: - kv 1: general.name str = Llama-3-Smaug-8B
llama_model_loader: - kv 2: llama.block_count u32 = 32
llama_model_loader: - kv 3: llama.context_length u32 = 8192
llama_model_loader: - kv 4: llama.embedding_length u32 = 4096
llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336
llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8
llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000
llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 10: general.file_type u32 = 7
llama_model_loader: - kv 11: llama.vocab_size u32 = 128256
llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128
llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 14: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 15: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 16: tokenizer.ggml.merges arr[str,280147] = ["\u0120 \u0120", "\u0120 \u0120\u0120\u0120", "\u0120\u0120 \u0120\u0120", "...
llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000
llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 128001
llama_model_loader: - kv 19: tokenizer.ggml.padding_token_id u32 = 128001
llama_model_loader: - kv 20: tokenizer.chat_template str = {% set loop_messages = messages %}{% ...
llama_model_loader: - kv 21: general.quantization_version u32 = 2
llama_model_loader: - type f32: 65 tensors
llama_model_loader: - type q8_0: 226 tensors
llm_load_vocab: missing pre-tokenizer type, using: 'default'
llm_load_vocab:
llm_load_vocab: ************************************
llm_load_vocab: GENERATION QUALITY WILL BE DEGRADED!
llm_load_vocab: CONSIDER REGENERATING THE MODEL
llm_load_vocab: ************************************
llm_load_vocab:
llm_load_vocab: control-looking token: 128009 '<|eot_id|>' was not control-type; this is probably a bug in the model. its type will be overridden
llm_load_vocab: control token: 128095 '<|reserved_special_token_90|>' is not marked as EOG
llm_load_vocab: control token: 128188 '<|reserved_special_token_183|>' is not marked as EOG
llm_load_vocab: control token: 128127 '<|reserved_special_token_122|>' is not marked as EOG
llm_load_vocab: control token: 128116 '<|reserved_special_token_111|>' is not marked as EOG
llm_load_vocab: control token: 128133 '<|reserved_special_token_128|>' is not marked as EOG
llm_load_vocab: control token: 128152 '<|reserved_special_token_147|>' is not marked as EOG
llm_load_vocab: control token: 128193 '<|reserved_special_token_188|>' is not marked as EOG
llm_load_vocab: control token: 128098 '<|reserved_special_token_93|>' is not marked as EOG
llm_load_vocab: control token: 128135 '<|reserved_special_token_130|>' is not marked as EOG
llm_load_vocab: control token: 128178 '<|reserved_special_token_173|>' is not marked as EOG
llm_load_vocab: control token: 128031 '<|reserved_special_token_26|>' is not marked as EOG
llm_load_vocab: control token: 128206 '<|reserved_special_token_201|>' is not marked as EOG
llm_load_vocab: control token: 128028 '<|reserved_special_token_23|>' is not marked as EOG
llm_load_vocab: control token: 128047 '<|reserved_special_token_42|>' is not marked as EOG
llm_load_vocab: control token: 128241 '<|reserved_special_token_236|>' is not marked as EOG
llm_load_vocab: control token: 128145 '<|reserved_special_token_140|>' is not marked as EOG
llm_load_vocab: control token: 128195 '<|reserved_special_token_190|>' is not marked as EOG
llm_load_vocab: control token: 128226 '<|reserved_special_token_221|>' is not marked as EOG
llm_load_vocab: control token: 128162 '<|reserved_special_token_157|>' is not marked as EOG
llm_load_vocab: control token: 128243 '<|reserved_special_token_238|>' is not marked as EOG
llm_load_vocab: control token: 128014 '<|reserved_special_token_9|>' is not marked as EOG
llm_load_vocab: control token: 128213 '<|reserved_special_token_208|>' is not marked as EOG
llm_load_vocab: control token: 128158 '<|reserved_special_token_153|>' is not marked as EOG
llm_load_vocab: control token: 128221 '<|reserved_special_token_216|>' is not marked as EOG
llm_load_vocab: control token: 128079 '<|reserved_special_token_74|>' is not marked as EOG
llm_load_vocab: control token: 128246 '<|reserved_special_token_241|>' is not marked as EOG
llm_load_vocab: control token: 128104 '<|reserved_special_token_99|>' is not marked as EOG
llm_load_vocab: control token: 128076 '<|reserved_special_token_71|>' is not marked as EOG
llm_load_vocab: control token: 128222 '<|reserved_special_token_217|>' is not marked as EOG
llm_load_vocab: control token: 128172 '<|reserved_special_token_167|>' is not marked as EOG
llm_load_vocab: control token: 128220 '<|reserved_special_token_215|>' is not marked as EOG
llm_load_vocab: control token: 128179 '<|reserved_special_token_174|>' is not marked as EOG
llm_load_vocab: control token: 128065 '<|reserved_special_token_60|>' is not marked as EOG
llm_load_vocab: control token: 128249 '<|reserved_special_token_244|>' is not marked as EOG
llm_load_vocab: control token: 128175 '<|reserved_special_token_170|>' is not marked as EOG
llm_load_vocab: control token: 128218 '<|reserved_special_token_213|>' is not marked as EOG
llm_load_vocab: control token: 128049 '<|reserved_special_token_44|>' is not marked as EOG
llm_load_vocab: control token: 128119 '<|reserved_special_token_114|>' is not marked as EOG
llm_load_vocab: control token: 128148 '<|reserved_special_token_143|>' is not marked as EOG
llm_load_vocab: control token: 128118 '<|reserved_special_token_113|>' is not marked as EOG
llm_load_vocab: control token: 128155 '<|reserved_special_token_150|>' is not marked as EOG
llm_load_vocab: control token: 128093 '<|reserved_special_token_88|>' is not marked as EOG
llm_load_vocab: control token: 128087 '<|reserved_special_token_82|>' is not marked as EOG
llm_load_vocab: control token: 128235 '<|reserved_special_token_230|>' is not marked as EOG
llm_load_vocab: control token: 128136 '<|reserved_special_token_131|>' is not marked as EOG
llm_load_vocab: control token: 128173 '<|reserved_special_token_168|>' is not marked as EOG
llm_load_vocab: control token: 128074 '<|reserved_special_token_69|>' is not marked as EOG
llm_load_vocab: control token: 128211 '<|reserved_special_token_206|>' is not marked as EOG
llm_load_vocab: control token: 128168 '<|reserved_special_token_163|>' is not marked as EOG
llm_load_vocab: control token: 128109 '<|reserved_special_token_104|>' is not marked as EOG
llm_load_vocab: control token: 128177 '<|reserved_special_token_172|>' is not marked as EOG
llm_load_vocab: control token: 128057 '<|reserved_special_token_52|>' is not marked as EOG
llm_load_vocab: control token: 128000 '<|begin_of_text|>' is not marked as EOG
llm_load_vocab: control token: 128149 '<|reserved_special_token_144|>' is not marked as EOG
llm_load_vocab: control token: 128113 '<|reserved_special_token_108|>' is not marked as EOG
llm_load_vocab: control token: 128069 '<|reserved_special_token_64|>' is not marked as EOG
llm_load_vocab: control token: 128056 '<|reserved_special_token_51|>' is not marked as EOG
llm_load_vocab: control token: 128091 '<|reserved_special_token_86|>' is not marked as EOG
llm_load_vocab: control token: 128184 '<|reserved_special_token_179|>' is not marked as EOG
llm_load_vocab: control token: 128100 '<|reserved_special_token_95|>' is not marked as EOG
llm_load_vocab: control token: 128124 '<|reserved_special_token_119|>' is not marked as EOG
llm_load_vocab: control token: 128020 '<|reserved_special_token_15|>' is not marked as EOG
llm_load_vocab: control token: 128034 '<|reserved_special_token_29|>' is not marked as EOG
llm_load_vocab: control token: 128225 '<|reserved_special_token_220|>' is not marked as EOG
llm_load_vocab: control token: 128002 '<|reserved_special_token_0|>' is not marked as EOG
llm_load_vocab: control token: 128088 '<|reserved_special_token_83|>' is not marked as EOG
llm_load_vocab: control token: 128041 '<|reserved_special_token_36|>' is not marked as EOG
llm_load_vocab: control token: 128215 '<|reserved_special_token_210|>' is not marked as EOG
llm_load_vocab: control token: 128208 '<|reserved_special_token_203|>' is not marked as EOG
llm_load_vocab: control token: 128070 '<|reserved_special_token_65|>' is not marked as EOG
llm_load_vocab: control token: 128165 '<|reserved_special_token_160|>' is not marked as EOG
llm_load_vocab: control token: 128180 '<|reserved_special_token_175|>' is not marked as EOG
llm_load_vocab: control token: 128231 '<|reserved_special_token_226|>' is not marked as EOG
llm_load_vocab: control token: 128232 '<|reserved_special_token_227|>' is not marked as EOG
llm_load_vocab: control token: 128064 '<|reserved_special_token_59|>' is not marked as EOG
llm_load_vocab: control token: 128036 '<|reserved_special_token_31|>' is not marked as EOG
llm_load_vocab: control token: 128103 '<|reserved_special_token_98|>' is not marked as EOG
llm_load_vocab: control token: 128247 '<|reserved_special_token_242|>' is not marked as EOG
llm_load_vocab: control token: 128170 '<|reserved_special_token_165|>' is not marked as EOG
llm_load_vocab: control token: 128123 '<|reserved_special_token_118|>' is not marked as EOG
llm_load_vocab: control token: 128044 '<|reserved_special_token_39|>' is not marked as EOG
llm_load_vocab: control token: 128237 '<|reserved_special_token_232|>' is not marked as EOG
llm_load_vocab: control token: 128042 '<|reserved_special_token_37|>' is not marked as EOG
llm_load_vocab: control token: 128192 '<|reserved_special_token_187|>' is not marked as EOG
llm_load_vocab: control token: 128075 '<|reserved_special_token_70|>' is not marked as EOG
llm_load_vocab: control token: 128134 '<|reserved_special_token_129|>' is not marked as EOG
llm_load_vocab: control token: 128183 '<|reserved_special_token_178|>' is not marked as EOG
llm_load_vocab: control token: 128045 '<|reserved_special_token_40|>' is not marked as EOG
llm_load_vocab: control token: 128073 '<|reserved_special_token_68|>' is not marked as EOG
llm_load_vocab: control token: 128026 '<|reserved_special_token_21|>' is not marked as EOG
llm_load_vocab: control token: 128010 '<|reserved_special_token_5|>' is not marked as EOG
llm_load_vocab: control token: 128194 '<|reserved_special_token_189|>' is not marked as EOG
llm_load_vocab: control token: 128053 '<|reserved_special_token_48|>' is not marked as EOG
llm_load_vocab: control token: 128120 '<|reserved_special_token_115|>' is not marked as EOG
llm_load_vocab: control token: 128092 '<|reserved_special_token_87|>' is not marked as EOG
llm_load_vocab: control token: 128086 '<|reserved_special_token_81|>' is not marked as EOG
llm_load_vocab: control token: 128054 '<|reserved_special_token_49|>' is not marked as EOG
llm_load_vocab: control token: 128160 '<|reserved_special_token_155|>' is not marked as EOG
llm_load_vocab: control token: 128005 '<|reserved_special_token_3|>' is not marked as EOG
llm_load_vocab: control token: 128050 '<|reserved_special_token_45|>' is not marked as EOG
llm_load_vocab: control token: 128157 '<|reserved_special_token_152|>' is not marked as EOG
llm_load_vocab: control token: 128219 '<|reserved_special_token_214|>' is not marked as EOG
llm_load_vocab: control token: 128032 '<|reserved_special_token_27|>' is not marked as EOG
llm_load_vocab: control token: 128159 '<|reserved_special_token_154|>' is not marked as EOG
llm_load_vocab: control token: 128202 '<|reserved_special_token_197|>' is not marked as EOG
llm_load_vocab: control token: 128106 '<|reserved_special_token_101|>' is not marked as EOG
llm_load_vocab: control token: 128182 '<|reserved_special_token_177|>' is not marked as EOG
llm_load_vocab: control token: 128111 '<|reserved_special_token_106|>' is not marked as EOG
llm_load_vocab: control token: 128156 '<|reserved_special_token_151|>' is not marked as EOG
llm_load_vocab: control token: 128176 '<|reserved_special_token_171|>' is not marked as EOG
llm_load_vocab: control token: 128112 '<|reserved_special_token_107|>' is not marked as EOG
llm_load_vocab: control token: 128084 '<|reserved_special_token_79|>' is not marked as EOG
llm_load_vocab: control token: 128110 '<|reserved_special_token_105|>' is not marked as EOG
llm_load_vocab: control token: 128051 '<|reserved_special_token_46|>' is not marked as EOG
llm_load_vocab: control token: 128027 '<|reserved_special_token_22|>' is not marked as EOG
llm_load_vocab: control token: 128167 '<|reserved_special_token_162|>' is not marked as EOG
llm_load_vocab: control token: 128008 '<|reserved_special_token_4|>' is not marked as EOG
llm_load_vocab: control token: 128061 '<|reserved_special_token_56|>' is not marked as EOG
llm_load_vocab: control token: 128115 '<|reserved_special_token_110|>' is not marked as EOG
llm_load_vocab: control token: 128203 '<|reserved_special_token_198|>' is not marked as EOG
llm_load_vocab: control token: 128096 '<|reserved_special_token_91|>' is not marked as EOG
llm_load_vocab: control token: 128130 '<|reserved_special_token_125|>' is not marked as EOG
llm_load_vocab: control token: 128187 '<|reserved_special_token_182|>' is not marked as EOG
llm_load_vocab: control token: 128094 '<|reserved_special_token_89|>' is not marked as EOG
llm_load_vocab: control token: 128083 '<|reserved_special_token_78|>' is not marked as EOG
llm_load_vocab: control token: 128117 '<|reserved_special_token_112|>' is not marked as EOG
llm_load_vocab: control token: 128190 '<|reserved_special_token_185|>' is not marked as EOG
llm_load_vocab: control token: 128046 '<|reserved_special_token_41|>' is not marked as EOG
llm_load_vocab: control token: 128239 '<|reserved_special_token_234|>' is not marked as EOG
llm_load_vocab: control token: 128139 '<|reserved_special_token_134|>' is not marked as EOG
llm_load_vocab: control token: 128185 '<|reserved_special_token_180|>' is not marked as EOG
llm_load_vocab: control token: 128141 '<|reserved_special_token_136|>' is not marked as EOG
llm_load_vocab: control token: 128244 '<|reserved_special_token_239|>' is not marked as EOG
llm_load_vocab: control token: 128062 '<|reserved_special_token_57|>' is not marked as EOG
llm_load_vocab: control token: 128114 '<|reserved_special_token_109|>' is not marked as EOG
llm_load_vocab: control token: 128030 '<|reserved_special_token_25|>' is not marked as EOG
llm_load_vocab: control token: 128181 '<|reserved_special_token_176|>' is not marked as EOG
llm_load_vocab: control token: 128037 '<|reserved_special_token_32|>' is not marked as EOG
llm_load_vocab: control token: 128201 '<|reserved_special_token_196|>' is not marked as EOG
llm_load_vocab: control token: 128207 '<|reserved_special_token_202|>' is not marked as EOG
llm_load_vocab: control token: 128242 '<|reserved_special_token_237|>' is not marked as EOG
llm_load_vocab: control token: 128132 '<|reserved_special_token_127|>' is not marked as EOG
llm_load_vocab: control token: 128068 '<|reserved_special_token_63|>' is not marked as EOG
llm_load_vocab: control token: 128150 '<|reserved_special_token_145|>' is not marked as EOG
llm_load_vocab: control token: 128191 '<|reserved_special_token_186|>' is not marked as EOG
llm_load_vocab: control token: 128174 '<|reserved_special_token_169|>' is not marked as EOG
llm_load_vocab: control token: 128233 '<|reserved_special_token_228|>' is not marked as EOG
llm_load_vocab: control token: 128245 '<|reserved_special_token_240|>' is not marked as EOG
llm_load_vocab: control token: 128238 '<|reserved_special_token_233|>' is not marked as EOG
llm_load_vocab: control token: 128209 '<|reserved_special_token_204|>' is not marked as EOG
llm_load_vocab: control token: 128204 '<|reserved_special_token_199|>' is not marked as EOG
llm_load_vocab: control token: 128001 '<|end_of_text|>' is not marked as EOG
llm_load_vocab: control token: 128003 '<|reserved_special_token_1|>' is not marked as EOG
llm_load_vocab: control token: 128004 '<|reserved_special_token_2|>' is not marked as EOG
llm_load_vocab: control token: 128011 '<|reserved_special_token_6|>' is not marked as EOG
llm_load_vocab: control token: 128012 '<|reserved_special_token_7|>' is not marked as EOG
llm_load_vocab: control token: 128013 '<|reserved_special_token_8|>' is not marked as EOG
llm_load_vocab: control token: 128015 '<|reserved_special_token_10|>' is not marked as EOG
llm_load_vocab: control token: 128016 '<|reserved_special_token_11|>' is not marked as EOG
llm_load_vocab: control token: 128017 '<|reserved_special_token_12|>' is not marked as EOG
llm_load_vocab: control token: 128018 '<|reserved_special_token_13|>' is not marked as EOG
llm_load_vocab: control token: 128019 '<|reserved_special_token_14|>' is not marked as EOG
llm_load_vocab: control token: 128021 '<|reserved_special_token_16|>' is not marked as EOG
llm_load_vocab: control token: 128022 '<|reserved_special_token_17|>' is not marked as EOG
llm_load_vocab: control token: 128023 '<|reserved_special_token_18|>' is not marked as EOG
llm_load_vocab: control token: 128024 '<|reserved_special_token_19|>' is not marked as EOG
llm_load_vocab: control token: 128025 '<|reserved_special_token_20|>' is not marked as EOG
llm_load_vocab: control token: 128029 '<|reserved_special_token_24|>' is not marked as EOG
llm_load_vocab: control token: 128033 '<|reserved_special_token_28|>' is not marked as EOG
llm_load_vocab: control token: 128035 '<|reserved_special_token_30|>' is not marked as EOG
llm_load_vocab: control token: 128038 '<|reserved_special_token_33|>' is not marked as EOG
llm_load_vocab: control token: 128039 '<|reserved_special_token_34|>' is not marked as EOG
llm_load_vocab: control token: 128040 '<|reserved_special_token_35|>' is not marked as EOG
llm_load_vocab: control token: 128043 '<|reserved_special_token_38|>' is not marked as EOG
llm_load_vocab: control token: 128048 '<|reserved_special_token_43|>' is not marked as EOG
llm_load_vocab: control token: 128052 '<|reserved_special_token_47|>' is not marked as EOG
llm_load_vocab: control token: 128055 '<|reserved_special_token_50|>' is not marked as EOG
llm_load_vocab: control token: 128058 '<|reserved_special_token_53|>' is not marked as EOG
llm_load_vocab: control token: 128059 '<|reserved_special_token_54|>' is not marked as EOG
llm_load_vocab: control token: 128060 '<|reserved_special_token_55|>' is not marked as EOG
llm_load_vocab: control token: 128063 '<|reserved_special_token_58|>' is not marked as EOG
llm_load_vocab: control token: 128066 '<|reserved_special_token_61|>' is not marked as EOG
llm_load_vocab: control token: 128067 '<|reserved_special_token_62|>' is not marked as EOG
llm_load_vocab: control token: 128071 '<|reserved_special_token_66|>' is not marked as EOG
llm_load_vocab: control token: 128072 '<|reserved_special_token_67|>' is not marked as EOG
llm_load_vocab: control token: 128077 '<|reserved_special_token_72|>' is not marked as EOG
llm_load_vocab: control token: 128078 '<|reserved_special_token_73|>' is not marked as EOG
llm_load_vocab: control token: 128080 '<|reserved_special_token_75|>' is not marked as EOG
llm_load_vocab: control token: 128081 '<|reserved_special_token_76|>' is not marked as EOG
llm_load_vocab: control token: 128082 '<|reserved_special_token_77|>' is not marked as EOG
llm_load_vocab: control token: 128085 '<|reserved_special_token_80|>' is not marked as EOG
llm_load_vocab: control token: 128089 '<|reserved_special_token_84|>' is not marked as EOG
llm_load_vocab: control token: 128090 '<|reserved_special_token_85|>' is not marked as EOG
llm_load_vocab: control token: 128097 '<|reserved_special_token_92|>' is not marked as EOG
llm_load_vocab: control token: 128099 '<|reserved_special_token_94|>' is not marked as EOG
llm_load_vocab: control token: 128101 '<|reserved_special_token_96|>' is not marked as EOG
llm_load_vocab: control token: 128102 '<|reserved_special_token_97|>' is not marked as EOG
llm_load_vocab: control token: 128105 '<|reserved_special_token_100|>' is not marked as EOG
llm_load_vocab: control token: 128107 '<|reserved_special_token_102|>' is not marked as EOG
llm_load_vocab: control token: 128108 '<|reserved_special_token_103|>' is not marked as EOG
llm_load_vocab: control token: 128121 '<|reserved_special_token_116|>' is not marked as EOG
llm_load_vocab: control token: 128122 '<|reserved_special_token_117|>' is not marked as EOG
llm_load_vocab: control token: 128125 '<|reserved_special_token_120|>' is not marked as EOG
llm_load_vocab: control token: 128126 '<|reserved_special_token_121|>' is not marked as EOG
llm_load_vocab: control token: 128128 '<|reserved_special_token_123|>' is not marked as EOG
llm_load_vocab: control token: 128129 '<|reserved_special_token_124|>' is not marked as EOG
llm_load_vocab: control token: 128131 '<|reserved_special_token_126|>' is not marked as EOG
llm_load_vocab: control token: 128137 '<|reserved_special_token_132|>' is not marked as EOG
llm_load_vocab: control token: 128138 '<|reserved_special_token_133|>' is not marked as EOG
llm_load_vocab: control token: 128140 '<|reserved_special_token_135|>' is not marked as EOG
llm_load_vocab: control token: 128142 '<|reserved_special_token_137|>' is not marked as EOG
llm_load_vocab: control token: 128143 '<|reserved_special_token_138|>' is not marked as EOG
llm_load_vocab: control token: 128144 '<|reserved_special_token_139|>' is not marked as EOG
llm_load_vocab: control token: 128146 '<|reserved_special_token_141|>' is not marked as EOG
llm_load_vocab: control token: 128147 '<|reserved_special_token_142|>' is not marked as EOG
llm_load_vocab: control token: 128151 '<|reserved_special_token_146|>' is not marked as EOG
llm_load_vocab: control token: 128153 '<|reserved_special_token_148|>' is not marked as EOG
llm_load_vocab: control token: 128154 '<|reserved_special_token_149|>' is not marked as EOG
llm_load_vocab: control token: 128161 '<|reserved_special_token_156|>' is not marked as EOG
llm_load_vocab: control token: 128163 '<|reserved_special_token_158|>' is not marked as EOG
llm_load_vocab: control token: 128164 '<|reserved_special_token_159|>' is not marked as EOG
llm_load_vocab: control token: 128166 '<|reserved_special_token_161|>' is not marked as EOG
llm_load_vocab: control token: 128169 '<|reserved_special_token_164|>' is not marked as EOG
llm_load_vocab: control token: 128171 '<|reserved_special_token_166|>' is not marked as EOG
llm_load_vocab: control token: 128186 '<|reserved_special_token_181|>' is not marked as EOG
llm_load_vocab: control token: 128189 '<|reserved_special_token_184|>' is not marked as EOG
llm_load_vocab: control token: 128196 '<|reserved_special_token_191|>' is not marked as EOG
llm_load_vocab: control token: 128197 '<|reserved_special_token_192|>' is not marked as EOG
llm_load_vocab: control token: 128198 '<|reserved_special_token_193|>' is not marked as EOG
llm_load_vocab: control token: 128199 '<|reserved_special_token_194|>' is not marked as EOG
llm_load_vocab: control token: 128200 '<|reserved_special_token_195|>' is not marked as EOG
llm_load_vocab: control token: 128205 '<|reserved_special_token_200|>' is not marked as EOG
llm_load_vocab: control token: 128210 '<|reserved_special_token_205|>' is not marked as EOG
llm_load_vocab: control token: 128212 '<|reserved_special_token_207|>' is not marked as EOG
llm_load_vocab: control token: 128214 '<|reserved_special_token_209|>' is not marked as EOG
llm_load_vocab: control token: 128216 '<|reserved_special_token_211|>' is not marked as EOG
llm_load_vocab: control token: 128217 '<|reserved_special_token_212|>' is not marked as EOG
llm_load_vocab: control token: 128223 '<|reserved_special_token_218|>' is not marked as EOG
llm_load_vocab: control token: 128224 '<|reserved_special_token_219|>' is not marked as EOG
llm_load_vocab: control token: 128227 '<|reserved_special_token_222|>' is not marked as EOG
llm_load_vocab: control token: 128228 '<|reserved_special_token_223|>' is not marked as EOG
llm_load_vocab: control token: 128229 '<|reserved_special_token_224|>' is not marked as EOG
llm_load_vocab: control token: 128230 '<|reserved_special_token_225|>' is not marked as EOG
llm_load_vocab: control token: 128234 '<|reserved_special_token_229|>' is not marked as EOG
llm_load_vocab: control token: 128236 '<|reserved_special_token_231|>' is not marked as EOG
llm_load_vocab: control token: 128240 '<|reserved_special_token_235|>' is not marked as EOG
llm_load_vocab: control token: 128248 '<|reserved_special_token_243|>' is not marked as EOG
llm_load_vocab: control token: 128250 '<|reserved_special_token_245|>' is not marked as EOG
llm_load_vocab: control token: 128251 '<|reserved_special_token_246|>' is not marked as EOG
llm_load_vocab: control token: 128252 '<|reserved_special_token_247|>' is not marked as EOG
llm_load_vocab: control token: 128253 '<|reserved_special_token_248|>' is not marked as EOG
llm_load_vocab: control token: 128254 '<|reserved_special_token_249|>' is not marked as EOG
llm_load_vocab: control token: 128255 '<|reserved_special_token_250|>' is not marked as EOG
llm_load_vocab: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
llm_load_vocab: special tokens cache size = 256
llm_load_vocab: token to piece cache size = 0.8000 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = llama
llm_load_print_meta: vocab type = BPE
llm_load_print_meta: n_vocab = 128256
llm_load_print_meta: n_merges = 280147
llm_load_print_meta: vocab_only = 0
llm_load_print_meta: n_ctx_train = 8192
llm_load_print_meta: n_embd = 4096
llm_load_print_meta: n_layer = 32
llm_load_print_meta: n_head = 32
llm_load_print_meta: n_head_kv = 8
llm_load_print_meta: n_rot = 128
llm_load_print_meta: n_swa = 0
llm_load_print_meta: n_embd_head_k = 128
llm_load_print_meta: n_embd_head_v = 128
llm_load_print_meta: n_gqa = 4
llm_load_print_meta: n_embd_k_gqa = 1024
llm_load_print_meta: n_embd_v_gqa = 1024
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-05
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: n_ff = 14336
llm_load_print_meta: n_expert = 0
llm_load_print_meta: n_expert_used = 0
llm_load_print_meta: causal attn = 1
llm_load_print_meta: pooling type = 0
llm_load_print_meta: rope type = 0
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 500000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn = 8192
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: ssm_dt_b_c_rms = 0
llm_load_print_meta: model type = 8B
llm_load_print_meta: model ftype = Q8_0
llm_load_print_meta: model params = 8.03 B
llm_load_print_meta: model size = 7.95 GiB (8.50 BPW)
llm_load_print_meta: general.name = Llama-3-Smaug-8B
llm_load_print_meta: BOS token = 128000 '<|begin_of_text|>'
llm_load_print_meta: EOS token = 128001 '<|end_of_text|>'
llm_load_print_meta: EOT token = 128009 '<|eot_id|>'
llm_load_print_meta: PAD token = 128001 '<|end_of_text|>'
llm_load_print_meta: LF token = 128 ' '
llm_load_print_meta: EOG token = 128001 '<|end_of_text|>'
llm_load_print_meta: EOG token = 128009 '<|eot_id|>'
llm_load_print_meta: max token length = 256
llm_load_tensors: tensor 'token_embd.weight' (q8_0) (and 0 others) cannot be used with preferred buffer type CPU_AARCH64, using CPU instead
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading output layer to GPU
llm_load_tensors: offloaded 33/33 layers to GPU
llm_load_tensors: CPU_Mapped model buffer size = 532.31 MiB
llm_load_tensors: CUDA0 model buffer size = 3757.53 MiB
llm_load_tensors: CUDA1 model buffer size = 3847.80 MiB
.........................................................................................
llama_new_context_with_model: n_seq_max = 1
llama_new_context_with_model: n_ctx = 8192
llama_new_context_with_model: n_ctx_per_seq = 8192
llama_new_context_with_model: n_batch = 512
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base = 500000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: CUDA0 KV buffer size = 544.00 MiB
llama_kv_cache_init: CUDA1 KV buffer size = 480.00 MiB
llama_new_context_with_model: KV self size = 1024.00 MiB, K (f16): 512.00 MiB, V (f16): 512.00 MiB
llama_new_context_with_model: CUDA_Host output buffer size = 0.49 MiB
llama_new_context_with_model: pipeline parallelism enabled (n_copies=4)
llama_new_context_with_model: CUDA0 compute buffer size = 640.01 MiB
llama_new_context_with_model: CUDA1 compute buffer size = 640.02 MiB
llama_new_context_with_model: CUDA_Host compute buffer size = 72.02 MiB
llama_new_context_with_model: graph nodes = 1030
llama_new_context_with_model: graph splits = 3
AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | AMX_INT8 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | RISCV_VECT = 0 | WASM_SIMD = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
Model metadata: {'general.name': 'Llama-3-Smaug-8B', 'general.architecture': 'llama', 'llama.block_count': '32', 'llama.context_length': '8192', 'tokenizer.ggml.eos_token_id': '128001', 'general.file_type': '7', 'llama.attention.head_count_kv': '8', 'llama.embedding_length': '4096', 'llama.feed_forward_length': '14336', 'llama.attention.head_count': '32', 'llama.rope.freq_base': '500000.000000', 'llama.attention.layer_norm_rms_epsilon': '0.000010', 'llama.vocab_size': '128256', 'llama.rope.dimension_count': '128', 'tokenizer.ggml.model': 'gpt2', 'general.quantization_version': '2', 'tokenizer.ggml.bos_token_id': '128000', 'tokenizer.ggml.padding_token_id': '128001', 'tokenizer.chat_template': "{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}"}
Available chat formats from metadata: chat_template.default
Guessed chat format: llama-3
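For reference, the `tokenizer.chat_template` string in the metadata above can be read as the prompt layout below. This is a hand translation of that Jinja template into plain Python, meant only to show what the resulting prompt looks like. It is not the code llama-cpp-python runs for the `llama-3` chat format, and the function name `render_llama3_prompt` is made up for this sketch.

```python
# BOS text as reported by the loader for token 128000.
BOS_TOKEN = "<|begin_of_text|>"


def render_llama3_prompt(messages, add_generation_prompt=True):
    """Mirror the Jinja chat template from the GGUF metadata (illustrative only)."""
    parts = []
    for index, message in enumerate(messages):
        # '<|start_header_id|>' + role + '<|end_header_id|>\n\n' + content|trim + '<|eot_id|>'
        content = (
            "<|start_header_id|>" + message["role"] + "<|end_header_id|>\n\n"
            + message["content"].strip()
            + "<|eot_id|>"
        )
        # The template prepends bos_token to the first message only.
        if index == 0:
            content = BOS_TOKEN + content
        parts.append(content)
    # Open an assistant turn so the model continues from there.
    if add_generation_prompt:
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


prompt = render_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "  Hello  "},
])
print(prompt)
```

Each turn ends with `<|eot_id|>`, which is why the loader's handling of that token as an end-of-generation marker matters for this model.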
10:55:35.527 [INFO ] private_gpt.components.embedding.embedding_component - Initializing the embedding model in mode=huggingface
10:55:35.902 [INFO ] sentence_transformers.SentenceTransformer - Load pretrained SentenceTransformer: nomic-ai/nomic-embed-text-v1.5
10:55:36.840 [WARNING ] transformers_modules.nomic-ai.nomic-bert-2048.eb02ceb48c1fdcc477ff1925c9732c379f0f0d1f.modeling_hf_nomic_bert -
10:55:36.902 [INFO ] sentence_transformers.SentenceTransformer - 2 prompts are loaded, with the keys: ['query', 'text']
10:56:07.999 [INFO ] llama_index.core.indices.loading - Loading all indices.
10:56:22.796 [INFO ] private_gpt.ui.ui - Mounting the gradio UI, at path=/
10:56:22.843 [INFO ] uvicorn.error - Started server process [2224]
10:56:22.843 [INFO ] uvicorn.error - Waiting for application startup.
10:56:22.843 [INFO ] uvicorn.error - Application startup complete.
10:56:22.843 [INFO ] uvicorn.error - Uvicorn running on http://0.0.0.0:8001 (Press CTRL+C to quit)
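To see where the startup time goes, the timestamped lines of this log can be parsed and subtracted. The sketch below is a small helper written for this log layout (`HH:MM:SS.mmm [LEVEL ] logger - message`). The regular expression and the names `parse_line` and `elapsed_seconds` are assumptions made for illustration and are not part of PrivateGPT.

```python
import re
from datetime import datetime

# Matches lines such as:
# 10:51:48.451 [INFO ] private_gpt.components.llm.llm_component - Initializing the LLM ...
LOG_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\w+)\s*\] (\S+) -\s?(.*)$")


def parse_line(line):
    """Return (timestamp, level, logger, message), or None for non-timestamped lines."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    stamp, level, logger, message = match.groups()
    return datetime.strptime(stamp, "%H:%M:%S.%f"), level, logger, message


def elapsed_seconds(first_line, second_line):
    """Seconds between two timestamped log lines from the same day."""
    first = parse_line(first_line)
    second = parse_line(second_line)
    return (second[0] - first[0]).total_seconds()


llm_start = (
    "10:51:48.451 [INFO ] private_gpt.components.llm.llm_component - "
    "Initializing the LLM in mode=llamacpp"
)
embedding_start = (
    "10:55:35.527 [INFO ] private_gpt.components.embedding.embedding_component - "
    "Initializing the embedding model in mode=huggingface"
)

gap = elapsed_seconds(llm_start, embedding_start)
print(f"LLM initialization to embedding initialization: {gap:.3f} s")
```

Applied to the two lines above, this gives a gap of roughly 3 minutes 47 seconds between the start of LLM initialization and the start of embedding initialization. Lines without a timestamp, such as the llama.cpp loader output, are skipped by `parse_line`.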