Run LLM from Linux Command Line
This revision is from 2024/07/16 20:38.
The .gguf format is typically used with the llama.cpp project and its Python bindings, not with the Hugging Face Transformers library. To use a .gguf file, you need to use a different library, such as llama-cpp-python.
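One way to tell the two formats apart: GGUF files start with the four-byte magic `GGUF`. A minimal sketch of checking for that magic (the tiny demo file written below is a stand-in; a real model file is gigabytes):

```python
import struct

def is_gguf(path):
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# Write a stand-in file with just the magic and a version field,
# so the check can be demonstrated without a real model download.
with open("demo.gguf", "wb") as f:
    f.write(b"GGUF" + struct.pack("<I", 3))

print(is_gguf("demo.gguf"))  # True
```

If this check fails on a file you downloaded, it is probably a Hugging Face safetensors/PyTorch checkpoint rather than a llama.cpp model.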
python3 -m pip install --upgrade pip
pip install llama-cpp-python
Python script:
from llama_cpp import Llama

model_path = "/path/to/Lexi-Llama-3-8B-Uncensored_Q8_0.gguf"
prompt = "What is the meaning of life?"

# Load the GGUF model
llm = Llama(model_path=model_path)

# Generate a response
response = llm(prompt, max_tokens=128)

# Print the response
print(response["choices"][0]["text"])
Execute:
python llm_script.py
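llama-cpp-python returns an OpenAI-style completion dictionary, which is why the script reads `response["choices"][0]["text"]`. A minimal sketch of that access pattern (the dict below is a hand-written stand-in, not real model output):

```python
# Hand-written stand-in for the dict llama-cpp-python returns;
# a real call produces the same shape with model-generated text.
response = {
    "id": "cmpl-example",
    "object": "text_completion",
    "choices": [
        {"text": " Forty-two.", "index": 0, "finish_reason": "stop"}
    ],
}

# The generated text lives under choices[0]["text"]
print(response["choices"][0]["text"].strip())  # Forty-two.
```

The same structure also carries `finish_reason`, which you can inspect to see whether generation stopped naturally or hit the `max_tokens` limit.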