Fine-Tuning Model Stuck: Job Enqueued, Waiting for Prior Jobs to Complete

Billy Zhou 0 Reputation points
2025-06-19T01:06:29.4633333+00:00

When I use the Python SDK provided by Azure to fine-tune a GPT-4o-mini model, the job stays stuck at "Job enqueued. Waiting for the jobs ahead to complete" (for over 12 hours now). Could you please help check it?


Sample python code:

from openai import AzureOpenAI
client = AzureOpenAI(
    azure_endpoint = "https://xxxxx.azure.com/",
    api_key = "xxxxx",
    api_version= "2024-12-01-preview",
)

def start_fine_tuning():
    response = client.fine_tuning.jobs.create(
        training_file = "file-f33a9fb8ef214edfaa94aa6d9a707f48xx",
        validation_file = "file-a3cf2c8e28304bea8f2edd053f8e8014xx",
        model = "gpt-4o-mini", # Enter base model name. Note that in Azure OpenAI the model name contains dashes and cannot contain dot/period characters.
        hyperparameters={
            "n_epochs":2
        },
        seed = 105 # seed parameter controls reproducibility of the fine-tuning job. If no seed is specified one will be generated automatically.
    )
    # You can use the job ID to monitor the status of the fine-tuning job.
    # The fine-tuning job will take some time to start and complete.

    print("Job ID:", response.id)
    print("Status:", response.status)
    print(response.model_dump_json(indent=2))

1 answer

  1. Pavankumar Purilla 7,850 Reputation points Microsoft External Staff Moderator
    2025-06-19T04:26:26.8133333+00:00

    Hi Billy Zhou,

    I understand that your fine-tuning job for the GPT-4o-mini model has been in the "Job enqueued. Waiting for the jobs ahead to complete" state for over 12 hours. This status typically indicates that the job is waiting in a queue due to high demand for fine-tuning resources. Azure OpenAI fine-tuning jobs are processed on shared compute infrastructure, and at times, especially with popular models like GPT-4o-mini, jobs may experience longer queue times than expected.

    While 12 hours is within the range of expected queuing delays during peak periods, we recognize that this can be inconvenient. We recommend monitoring the job in the Fine-tuning section of Azure AI Studio and refreshing the portal periodically to check for status updates. If there’s no progress after a longer wait (e.g., 24 hours), you may consider canceling the current job and resubmitting it, which can sometimes resolve hidden queuing or scheduling issues.
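The monitor-then-cancel approach described above can also be scripted instead of refreshing the portal. The following is only a minimal sketch, not an official procedure: `wait_for_job`, its parameters, and the 24-hour default are hypothetical names and values chosen for illustration, and treating every non-terminal status other than "running" as "still queued" is an assumption, since the exact status strings can vary by API version. It relies on the `fine_tuning.jobs.retrieve` and `fine_tuning.jobs.cancel` calls of the `openai` Python SDK.

```python
import time

# Statuses after which the job will not change any more.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(client, job_id, poll_seconds=60, max_queue_seconds=24 * 3600,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll a fine-tuning job and cancel it if it never leaves the queue.

    `client` is expected to be an AzureOpenAI client (anything exposing
    client.fine_tuning.jobs.retrieve / cancel works). Returns the final
    status string, or "cancelled" if this function cancelled the job.
    """
    start = clock()
    while True:
        job = client.fine_tuning.jobs.retrieve(job_id)
        print("Status:", job.status)

        if job.status in TERMINAL_STATUSES:
            return job.status

        # Assumption: any non-terminal status other than "running"
        # means the job is still waiting in the queue.
        still_queued = job.status != "running"
        if still_queued and clock() - start > max_queue_seconds:
            # Give up on this job so that it can be resubmitted.
            client.fine_tuning.jobs.cancel(job_id)
            return "cancelled"

        sleep(poll_seconds)
```

With the client and job from the question, this would be called as `wait_for_job(client, response.id)`. If it returns "cancelled", the job can be resubmitted with the same `client.fine_tuning.jobs.create(...)` call. The `sleep` and `clock` parameters exist only to make the function easy to test without real waiting.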

    In parallel, we recommend checking the Azure Status Page for any ongoing service disruptions or regional capacity issues that might be affecting your fine-tuning job.

    Reference thread: https://learn.microsoft.com/en-us/answers/questions/2260568/fine-tuning-job-stuck-in-training-status-for-over

    Also, you can refer to "Check the status of your custom model" and "Troubleshooting for Azure OpenAI fine-tuning".

    I hope this helps. Do let me know if you have any further queries.

    Thank you!

