Llama 3.1 shipped with native function calling support, but instead of taking a tool_choice parameter like traditional function calling APIs, it works through a special prompt syntax. Let's take a look at how to do function calling with Llama 3.1 models – strictly with a custom prompt!
According to Meta, Llama 3.1 70B and Llama 3.1 405B are the two recommended options if you want a full conversation with tool calling; Llama 3.1 8B is good at zero-shot tool calling, but can't hold a full conversation at the same time.

Function calling w/ Llama 3.1 70B

Say we have a weather function and want Llama 3.1 to be able to call it when it sees fit. We'll describe the function's attributes in a weatherTool dict, use the special prompt from llama-agentic-system (Meta's reference repo) to pass that definition to the model in the system prompt, and send in a user prompt asking what the weather is in Tokyo.
import json
from together import Together

client = Together()

weatherTool = {
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA",
            },
        },
        "required": ["location"],
    },
}
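
# The JSON schema above only describes the tool to the model – we still need
# a Python implementation to run when the model calls it. A minimal sketch;
# the canned payload is a stand-in for a real weather API call.
def get_current_weather(location: str) -> str:
    return json.dumps({"location": location, "temperature": "22", "unit": "celsius"})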

toolPrompt = f"""
You have access to the following functions:

Use the function '{weatherTool["name"]}' to '{weatherTool["description"]}':
{json.dumps(weatherTool)}

If you choose to call a function ONLY reply in the following format with no prefix or suffix:

<function=example_function_name>{{"example_name": "example_value"}}</function>

Reminder:
- Function calls MUST follow the specified format, start with <function= and end with </function>
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line
- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls

"""

messages = [
    {
        "role": "system",
        "content": toolPrompt,
    },
    {
        "role": "user",
        "content": "What is the weather in Tokyo?",
    },
]

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    messages=messages,
    max_tokens=1024,
    temperature=0,
)

# Save the assistant's function-call reply so we can continue the
# conversation after executing the tool.
messages.append(response.choices[0].message)
print(response.choices[0].message.content)
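
With temperature set to 0, the model should reply with nothing but the function call, something like <function=get_current_weather>{"location": "Tokyo, Japan"}</function>. To close the loop, we can parse that reply, run our Python function, and send the result back so the model can answer in natural language. The sketch below is one way to do it rather than an official recipe: the regex, the dispatch on the function name, and the "tool" role for the result message are all assumptions (Meta's own chat template uses an "ipython" role for tool output, and provider support for either varies).

import re

# Assumption: the model followed the format and replied with a single
# <function=...>{...}</function> line.
match = re.search(r"<function=(\w+)>(.*?)</function>", response.choices[0].message.content)

if match:
    function_name, args_string = match.groups()
    args = json.loads(args_string)

    # Dispatch to the matching Python function – here, the stub defined above.
    if function_name == "get_current_weather":
        result = get_current_weather(**args)

        # Hand the tool output back and ask for a final, natural-language answer.
        messages.append({"role": "tool", "content": result})

        followup = client.chat.completions.create(
            model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
            messages=messages,
            max_tokens=1024,
            temperature=0,
        )
        print(followup.choices[0].message.content)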