jackboi / tinyllama-resume

**Resume Classification Model**

**_Overview_**
This repository contains a proof-of-concept 'light' Large Language Model (LLM) fine-tuned on a dataset that maps resumes to labor categories. The model classifies each resume into a professional category based on the skills and experience it describes.

**_Model Details_**
- Type: Fine-tuned 'light' LLM
- Purpose: Resume classification
- Training Data: Approximately 4,000 synthetic resumes

**[View the full labor_categories.json](https://github.com/j-webtek/FileReference/blob/main/labor_categories)**

**[View a sample of the training data](https://github.com/j-webtek/FileReference/blob/main/Sample_Training_Data)**

**_Labor Categories_**
The model is trained to classify resumes into the following labor categories:

- Business Administration
- Contracts Administration
- Cyber Security
- Cyber Security Technical Analysis
- Data Analysis
- Data Science
- Engineering (General)
- Executives
- Financial Analysis
- Intelligence Analysis
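
A generative model returns free text rather than a fixed label, so its output usually has to be mapped back onto the category list. The block below is a minimal sketch of that step, not the repository's actual post-processing: the integer ids, the `LABEL2ID`/`ID2LABEL` names, and the `match_category` helper are all assumptions made for illustration, and the authoritative category definitions remain in `labor_categories.json`.

```python
# The ten labor categories, in the order this README lists them.
LABOR_CATEGORIES = [
    "Business Administration",
    "Contracts Administration",
    "Cyber Security",
    "Cyber Security Technical Analysis",
    "Data Analysis",
    "Data Science",
    "Engineering (General)",
    "Executives",
    "Financial Analysis",
    "Intelligence Analysis",
]

# Hypothetical integer ids (assumption): assigned by list position only.
LABEL2ID = {name: i for i, name in enumerate(LABOR_CATEGORIES)}
ID2LABEL = {i: name for name, i in LABEL2ID.items()}


def match_category(generated_text):
    """Return the labor category named in `generated_text`, or None.

    Longer names are checked first so that "Cyber Security Technical
    Analysis" is not mistaken for its prefix "Cyber Security".
    """
    lowered = generated_text.lower()
    for name in sorted(LABOR_CATEGORIES, key=len, reverse=True):
        if name.lower() in lowered:
            return name
    return None
```

Checking longer names first matters because one category name is a prefix of another. If the text mentions several categories, this sketch returns the one with the longest name, which is a simplification rather than a ranking.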

**_Training Data_**
The training dataset consists of synthetic resumes created to represent varying skill levels within each of the above labor categories. These synthetic resumes were generated to capture the diverse range of experiences, skills, and qualifications typical of professionals in these fields.

**_Use Case_**
This model can be used to:
- Automatically categorize job applications
- Assist in HR processes for initial resume screening
- Help job seekers understand which labor category their resume might fall into

**_Limitations_**
- As a proof of concept, this model may not capture all nuances of real-world resumes.
- Performance may vary on resumes from fields not represented in the training data.
- The model's accuracy on real-world data should be thoroughly evaluated before any production use.
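
One way to approach the evaluation mentioned above is to measure accuracy per category on a hand-labeled set of real resumes, since an overall figure can hide categories the model handles poorly. The following is a rough sketch under stated assumptions: `per_category_accuracy`, `keyword_stub`, and the toy examples are hypothetical and are not part of this repository. The stub stands in for a real call to the model, and the examples are made up, not evaluation data.

```python
from collections import defaultdict


def per_category_accuracy(examples, classify):
    """Compute overall and per-category accuracy.

    examples: iterable of (resume_text, true_category) pairs.
    classify: callable mapping resume text to a predicted category name.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for text, truth in examples:
        total[truth] += 1
        if classify(text) == truth:
            correct[truth] += 1
    per_category = {cat: correct[cat] / total[cat] for cat in total}
    n = sum(total.values())
    overall = sum(correct.values()) / n if n else 0.0
    return overall, per_category


# Placeholder classifier for demonstration only (assumption);
# replace it with a function that queries the fine-tuned model.
def keyword_stub(text):
    return "Data Science" if "machine learning" in text.lower() else "Executives"


# Tiny made-up examples, NOT real evaluation data.
toy_examples = [
    ("Built machine learning pipelines in Python.", "Data Science"),
    ("Led a 200-person division as COO.", "Executives"),
    ("Audited vendor contracts for compliance.", "Contracts Administration"),
]

overall, by_category = per_category_accuracy(toy_examples, keyword_stub)
```

Here the stub can never predict "Contracts Administration", so that category scores zero even though two of the three toy examples are classified correctly. That kind of gap is what a per-category breakdown is meant to expose.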

**_Future Work_**
- Expand the training dataset with more diverse and real-world resumes
- Fine-tune the model on additional labor categories
- Evaluate and improve the model's performance on edge cases and multi-category resumes
