Refer to the Python code below to run a test.

import requests
import json

# Model initialization and configuration

api_url = "http://localhost:11434/api/generate"
headers = {
    "Content-Type": "application/json"
}

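# The prompt strings stay in Korean because the model is a Korean conversation model
instruction = "구독자들을 위한 인사말을 작성해줘"  # "Write a greeting for the subscribers"
input_text = "같이할래 코딩 유튜브"  # name of a coding YouTube channel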

# Format the request data

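data = {
    "model": "sungwoo/llama3.1-ko-conversation",  # name of the model to use
    "prompt": f"{instruction}\n{input_text}\n",
    # Ollama reads generation settings from "options"; "num_predict" caps the output length
    "options": {
        "num_predict": 200  # maximum number of tokens to generate
    }
}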

# Send the input to the model with a POST request and receive the streaming response

response = requests.post(api_url, headers=headers, data=json.dumps(data), stream=True)

if response.status_code == 200:
    # Combine the streamed chunks into a single text
    final_response = ""
    for line in response.iter_lines():
        if line:
            # Parse each streamed line as JSON
            response_data = json.loads(line.decode("utf-8"))
            if "response" in response_data:
                final_response += response_data["response"]
            if response_data.get("done", False):
                break
    print("Generated Text:", final_response)
else:
    print("Failed to generate response:", response.text)
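The stream-handling step can be tried without a running server. The sketch below pulls the same parsing logic into a helper and feeds it hand-written chunks. The name `collect_stream` and the sample chunks are assumptions made for illustration, not part of the model's README. The chunk shape (one JSON object per line with `response` and `done` keys) simply mirrors what the script above reads.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[bytes]) -> str:
    """Join the "response" fragments of a newline-delimited JSON stream.

    Hypothetical helper: it mirrors the loop in the script above and
    stops at the first chunk whose "done" flag is true.
    """
    parts = []
    for line in lines:
        if not line:  # skip empty keep-alive lines
            continue
        chunk = json.loads(line.decode("utf-8"))
        parts.append(chunk.get("response", ""))
        if chunk.get("done", False):
            break
    return "".join(parts)


# Offline check with made-up chunks; a real stream comes from the server.
fake_stream = [
    b'{"response": "Hello", "done": false}',
    b"",
    b'{"response": ", world", "done": true}',
    b'{"response": " (never read)", "done": false}',
]
print(collect_stream(fake_stream))  # prints: Hello, world
```

Against a live request, the same helper could be called as `collect_stream(response.iter_lines())` in place of the inline loop.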

llama3.1-korean-mrc
