Run LLM from Linux Command Line


The .gguf format is typically used with the llama.cpp project and its Python bindings, not with the Hugging Face Transformers library. To use a .gguf file, you need to use a different library, such as llama-cpp-python.
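(Optional sanity check: every GGUF file begins with the four-byte magic "GGUF", so you can verify a download before trying to load it.)

with open("/path/to/Lexi-Llama-3-8B-Uncensored_Q8_0.gguf", "rb") as f:
    assert f.read(4) == b"GGUF", "not a GGUF file"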

Install the required packages (llama-cpp-python for the GGUF example, transformers and torch for the Hugging Face example further down):

python3 -m pip install --upgrade pip

pip install transformers

pip install torch

pip install llama-cpp-python
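To confirm the packages import cleanly before writing any code:

python3 -c "import llama_cpp, transformers, torch; print(llama_cpp.__version__, transformers.__version__, torch.__version__)"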

Python script (llm_script.py, using llama-cpp-python):

from llama_cpp import Llama

model_path = "/path/to/Lexi-Llama-3-8B-Uncensored_Q8_0.gguf"
prompt = "What is the meaning of life?"

# Load the model
llm = Llama(model_path=model_path)

# Generate a response
response = llm(prompt, max_tokens=256)

# Print the generated text
print(response['choices'][0]['text'])
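
The constructor and the call accept tuning parameters as well; a sketch with illustrative values (n_gpu_layers only takes effect in a GPU-enabled build of llama-cpp-python):

llm = Llama(
    model_path=model_path,
    n_ctx=4096,       # context window size in tokens
    n_gpu_layers=-1,  # offload all layers to the GPU
)
response = llm(prompt, max_tokens=256, temperature=0.7)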

Execute:

python3 llm_script.py

To load a Hugging Face Transformers model directly:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "internlm/internlm2-chat-7b"  # Hugging Face model ID

# internlm2 ships custom modeling code, so trust_remote_code=True is needed
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

prompt = "What is the meaning of life?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
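
internlm2-chat-7b is a chat-tuned model, so wrapping the prompt in the tokenizer's chat template usually gives better answers than a raw string; a minimal sketch, assuming the repository defines a chat template (chat models on the Hub typically do):

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=100)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))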
