Based on WhiteRabbitNeo-7B-v1.5a, WhiteRabbitNeo is a model series that can be used for offensive and defensive cybersecurity.

This build is based on WhiteRabbitNeo-7B-v1.5a and is quantized to q4_K_M for a balance between memory usage and output quality.
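Since this build is a quantized Ollama model, you can also query it through a local Ollama server's REST API. The snippet below is a minimal sketch: it assumes the model was pulled under the tag "whiterabbitneo" and that the server is listening on its default port 11434; adjust both to match your setup.

# Minimal sketch: query the quantized model via a local Ollama server.
# The model tag "whiterabbitneo" and the default port 11434 are assumptions;
# change them to whatever you pulled and however your server is configured.
import json
import urllib.request

payload = {
    "model": "whiterabbitneo",
    "prompt": "Explain how to check a web server for directory traversal issues.",
    "stream": False,
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])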

WhiteRabbitNeo is an AI company focused on cybersecurity. Its models are designed to identify vulnerabilities in systems so they can be made more secure, supporting both offensive and defensive work. You can find the models on Hugging Face: https://huggingface.co/WhiteRabbitNeo.

You can also use their managed inference service at: https://www.whiterabbitneo.com/

WhiteRabbitNeo

WhiteRabbitNeo is a model series that can be used for offensive and defensive cybersecurity.

Our models are being released as a public preview of their capabilities and to help assess the societal impact of such an AI.
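
The following script loads the original WhiteRabbitNeo-7B-v1.5a weights from Hugging Face with the transformers library (rather than the quantized Ollama build) and runs an interactive chat loop using the SYSTEM/USER/ASSISTANT prompt format: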

import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "WhiteRabbitNeo/WhiteRabbitNeo-7B-v1.5a"

# Load the model in 8-bit (requires the bitsandbytes package); set
# load_in_8bit=False to load the full float16 weights instead.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",
    load_in_4bit=False,
    load_in_8bit=True,
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)


def generate_text(instruction):
    # Tokenize the prompt and move it to the GPU.
    tokens = tokenizer.encode(instruction)
    tokens = torch.LongTensor(tokens).unsqueeze(0)
    tokens = tokens.to("cuda")

    # Sampling parameters for generation.
    instance = {
        "input_ids": tokens,
        "top_p": 1.0,
        "temperature": 0.5,
        "generate_len": 1024,
        "top_k": 50,
    }

    length = len(tokens[0])
    with torch.no_grad():
        rest = model.generate(
            input_ids=tokens,
            max_length=length + instance["generate_len"],
            use_cache=True,
            do_sample=True,
            top_p=instance["top_p"],
            temperature=instance["temperature"],
            top_k=instance["top_k"],
            num_return_sequences=1,
        )
    # Decode only the newly generated tokens and stop at the next "USER:" turn.
    output = rest[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    answer = string.split("USER:")[0].strip()
    return answer



conversation = f"SYSTEM: You are an AI that code. Answer with code."


while True:
    user_input = input("You: ")
    # Build the prompt in the SYSTEM/USER/ASSISTANT format the model expects.
    llm_prompt = f"{conversation} \nUSER: {user_input} \nASSISTANT: "
    answer = generate_text(llm_prompt)
    print(answer)
    # Append the exchange so the next turn keeps the full conversation history.
    conversation = f"{llm_prompt}{answer}"
    # print(conversation)
    json_data = {"prompt": user_input, "answer": answer}

    # print(json_data)
    # To log each exchange as JSONL, define output_file_path and uncomment:
    # with open(output_file_path, "a") as output_file:
    #     output_file.write(json.dumps(json_data) + "\n")