Client & Chat

Setup

Exported source
from gaspare.utils import *
from gaspare.core import *

import os

from google import genai
from google.genai import types

from fastcore.all import *
c = genai.client.Client(api_key=os.environ.get("GEMINI_API_KEY"))

Content generation

One of the biggest quality of life improvements of the Claudette family of libraries is the __call__ method on the client. We can implement it easily with the functions introduced earlier, taking into account that some of the most common parameters (like model) might be stored on the client itself.

We only have to patch the __call__ method so that:

  1. We have an API compatible with Claudette
  2. We can pass all the parameters of types.GenerateContentConfigDict

As a general rule we follow this order of precedence (sketched right below):

  1. Named parameters that are set in the call itself
  2. Named parameters passed as kwargs
  3. Parameters set on the client (only for model, sp and temp at the moment)
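
For illustration, here is a minimal sketch of that precedence rule (the resolve helper is hypothetical, not part of the library):

def resolve(call_val, name, kwargs, client):
    "Hypothetical helper: a call-site value beats kwargs, which beat client defaults."
    if call_val is not None: return call_val   # 1. set in the call itself
    if name in kwargs: return kwargs[name]     # 2. passed as kwargs
    return getattr(client, name, None)         # 3. stored on the client

resolve(None, 'temp', {'temp': 0.9}, c)  # -> 0.9: no call-site value, so the kwargs win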


Exported source
@patch
def __call__(self: genai.models.Models | genai.models.AsyncModels, 
             inps=None, # The inputs to be passed to the model
             sp:str='', # Optional system prompt
             temp:float=0.6, # Temperature
             maxtok:int|None=None, # Maximum number of tokens for the output
             stream:bool=False, # Stream response?
             stop:str|list[str]|None=None, # Stop sequence[s]
             tools=None, # A list of functions or tools to be passed to the model
             use_afc=False, # Use Google's automatic function calling? If False, functions will be converted to tools
             # `AUTO` lets the model decide whether to use tools,
             # `ANY` forces the model to call a function, `NONE` avoids any function calling
             tool_mode='AUTO',
             maxthinktok:int=8000, # "Thinking" token budget for models that allow it
             **kwargs):
    """Call to a Gemini LLM"""
    kwargs["model"] = kwargs.get("model", getattr(self, "model", None))
    model = kwargs["model"]
    prepped_tools = None
    if tools:
        self._tools = tools
        prepped_tools = prep_tools(tools, toolify_everything=not use_afc)  
    t_budget = maxthinktok if model in thinking_models else None
    config = self._genconf(sp=sp, temp=temp, maxtok=maxtok, stop=stop, tools=prepped_tools, 
                           tool_mode=tool_mode, maxthinktok=t_budget, **kwargs)
    
    contents = mk_contents(inps, cli=kwargs.get('client', None))    
    gen_f = self.generate_content_stream if stream else self.generate_content
    r = gen_f(model=model, contents=contents, config=config if config else None)
    return self._stream(r, think=t_budget) if stream else self._r(r, think=t_budget)


@patch
@delegates(genai.models.Models.__call__)
def __call__(self: genai.Client | genai.client.AsyncClient, inps=None, **kwargs):
    return self.models(inps, client=self, **kwargs)


c("Hi Gemini", model=models[0])
Hi there! How can I help you today?
  • candidates:
    candidates[0]
    • content:
      • role: model
      • parts:
        parts[0]
        • text: Hi there! How can I help you today?
    • finish_reason: FinishReason.STOP
    • avg_logprobs: -0.006288008933717554
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 2; Out: 11; Total: 13
  • model_version: gemini-2.0-flash
c("give me a numbered list of 10 animals", 
  stop="4", 
  sp='always talk in Spanish', 
  maxtok=10, model=models[0])
¡Claro que sí! Aquí tienes una lista
  • candidates:
    candidates[0]
    • content:
      • role: model
      • parts:
        parts[0]
        • text: ¡Claro que sí! Aquí tienes una lista
    • finish_reason: FinishReason.MAX_TOKENS
    • avg_logprobs: -0.1549281809065077
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 14; Out: 9; Total: 23
  • model_version: gemini-2.0-flash

Client Constructor

Since we have limited ourselves to adding functionality to the SDK client, our Client is just a function that stores a few parameters on it to make generation more convenient, rather than a class of its own. This means that everything in the official SDK still works, plus we get all the niceties of Claudette compatibility.


source

Client

 Client (model:str, sp:str='', temp:float=0.6, text_only:bool=False)

An extension of google.genai.Client with a series of quality of life improvements

|           | Type  | Default | Details |
|-----------|-------|---------|---------|
| model     | str   |         | The model to be used by default (can be overridden when generating) |
| sp        | str   | ''      | System prompt |
| temp      | float | 0.6     | Default temperature |
| text_only | bool  | False   | Suppress multimodality even if the model allows for it |
Exported source
def Client(model:str, # The model to be used by default (can be overridden when generating)
           sp:str='', # System prompt
           temp:float=0.6, # Default temperature
           text_only:bool=False, # Suppress multimodality even if the model allows for it
          ): 
    """An extension of `google.genai.Client` with a series of quality of life improvements"""
    c = genai.Client(api_key=os.environ['GEMINI_API_KEY'])
    c.models.post_cbs = [c.models._call_tools]
    c.models.model, c.models.sp, c.models.temp, c.models.text_only = model, sp, temp, text_only
    return c
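
Because Client returns a regular genai.Client, the stock SDK surface remains available alongside the patched call. A small sketch (the "Ping" prompt is arbitrary):

sdk_cli = Client(model=models[0])
# the plain SDK method still works, untouched by our patches
r = sdk_cli.models.generate_content(model=sdk_cli.models.model, contents="Ping")
print(r.text)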
cli = Client(model=models[0])
r = cli(["What is this video about?", "https://youtu.be/r-GSGH2RxJs?si=qoKg_wl5KBV6sIjw"])
r
This video is an introduction to HTMX, a library for the web that can replace JavaScript frameworks with the simplicity of HTML. The video explains how HTML is more capable than people think. HTMX embraces an architectural constraint known as hypermedia as the engine of application state. This library adds new attributes to HTML that can handle complex requirements. It gives you the ability to make a request to the server from any element by providing an attribute with an HTTP verb and a URL endpoint on the server. It’ll take the response from the server and replace this element asynchronously. It can also specify a target to replace a different element on the page. HTMX can customize the event on which it’s triggered along with modifiers like delay and throttle to control the way the request is sent. It also keeps track of the loading state so you can show a spinner, apply CSS transitions for animation, and builds on the HTML validation API to validate forms. The library even has a client-side router called Boost that can make any traditional web application feel like a faster single-page application. HTMX also includes extensions for more advanced features like web sockets and integrations with other HTML frameworks like Alpine. To install it, simply import HTMX in a script tag from the head of an HTML document. All you need is a server in your favorite programming language that returns HTML text as a response.
  • candidates:
    candidates[0]
    • finish_reason: FinishReason.STOP
    • content:
      • role: model
      • parts:
        parts[0]
        • text: This video is an introduction to HTMX, a library for the web that can replace JavaScript frameworks with the simplicity of HTML. The video explains how HTML is more capable than people think. HTMX embraces an architectural constraint known as hypermedia as the engine of application state. This library adds new attributes to HTML that can handle complex requirements. It gives you the ability to make a request to the server from any element by providing an attribute with an HTTP verb and a URL endpoint on the server. It’ll take the response from the server and replace this element asynchronously. It can also specify a target to replace a different element on the page. HTMX can customize the event on which it’s triggered along with modifiers like delay and throttle to control the way the request is sent. It also keeps track of the loading state so you can show a spinner, apply CSS transitions for animation, and builds on the HTML validation API to validate forms. The library even has a client-side router called Boost that can make any traditional web application feel like a faster single-page application. HTMX also includes extensions for more advanced features like web sockets and integrations with other HTML frameworks like Alpine. To install it, simply import HTMX in a script tag from the head of an HTML document. All you need is a server in your favorite programming language that returns HTML text as a response.
    • citation_metadata:
      • citations:
        citations[0]
        • start_index: 369
        • end_index: 489
        citations[1]
        • start_index: 684
        • end_index: 815
        citations[2]
        • start_index: 821
        • end_index: 970
        citations[3]
        • uri: https://letsusetech.com/the-awesome-things-you-can-do-with-htmx
        • start_index: 992
        • end_index: 1118
        citations[4]
        • start_index: 1194
        • end_index: 1333
    • avg_logprobs: -0.3231910566850142
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 6; Out: 275; Total: 281
  • model_version: gemini-2.0-flash
cli.models.model = "gemini-2.0-flash-exp-image-generation"

cli("Generate an image of a mountain view and write a short poem about it")


Silent Sentinels

Stone giants rise, in hues of gray,
Beneath a sky where clouds hold sway.
A breath of wind, a distant sound,
Peace on the heights, where none are bound.
  • candidates:
    candidates[0]
    • content:
      • role: model
      • parts:
        parts[0]
        • inline_data:
          • data: b’89PNG1a…’
          • mime_type: image/png
        parts[1]
        • text:

          Silent Sentinels

          Stone giants rise, in hues of gray, Beneath a sky where clouds hold sway. A breath of wind, a distant sound, Peace on the heights, where none are bound.
    • finish_reason: FinishReason.STOP
    • index: 0
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 15; Out: 47; Total: 62
  • model_version: gemini-2.0-flash-exp-image-generation
cli("What is the Capital city of Burundi?")
The capital city of Burundi is Gitega.
  • candidates:
    candidates[0]
    • content:
      • role: model
      • parts:
        parts[0]
        • text: The capital city of Burundi is Gitega.
    • finish_reason: FinishReason.STOP
    • index: 0
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 9; Out: 11; Total: 20
  • model_version: gemini-2.0-flash-exp-image-generation
cli.models.text_only = True

cli("Generate an image of a duck and give it a name")
Okay, here’s an image of a duck, and I’ve named him “Quackers”:

  • candidates:
    candidates[0]
    • content:
      • role: model
      • parts:
        parts[0]
        • text: Okay, here’s an image of a duck, and I’ve named him “Quackers”:

    • finish_reason: FinishReason.STOP
    • avg_logprobs: -0.44406358400980633
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 11; Out: 24; Total: 35
  • model_version: gemini-2.0-flash-exp-image-generation
Exported source
@patch
def _repr_markdown_(self:genai.models.Models | genai.models.AsyncModels):
    if not hasattr(self,'result'): return 'No results yet'
    msg = self.result._repr_markdown_()
    return f"""{msg}


|        | Input | Output | Cached |
|--------|------:|-------:|-------:|
| Tokens | {self.use.inp:,} | {self.use.out:,}  | {self.use.cached:,}  |
| **Totals** | **Tokens: {self.use.total:,}** | **${self.cost:.6f}** |  |
"""

@patch
def _repr_markdown_(self:genai.Client | genai.client.AsyncClient): return self.models._repr_markdown_()
cli
Okay, here’s an image of a duck, and I’ve named him “Quackers”:

  • candidates:
    candidates[0]
    • content:
      • role: model
      • parts:
        parts[0]
        • text: Okay, here’s an image of a duck, and I’ve named him “Quackers”:

    • finish_reason: FinishReason.STOP
    • avg_logprobs: -0.44406358400980633
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 11; Out: 24; Total: 35
  • model_version: gemini-2.0-flash-exp-image-generation
|            | Input | Output | Cached |
|------------|------:|-------:|-------:|
| Tokens     | 41    | 357    | 0      |
| **Totals** | **Tokens: 398** | **$0.000111** |  |

Structured calls



Exported source
@patch
def structured(self: genai.models.Models, inps, tool, model=None, **kwargs):
    if model is not None: kwargs['model'] = model  # forward an explicit model override
    _ = self(inps, tools=[tool], use_afc=False, tool_mode="ANY", temp=0., stream=False, **kwargs)
    return [nested_idx(ct, "function_response", "response", "result") for ct in nested_idx(self, "result_content", -1, "parts") or []]

@patch
async def structured(self: genai.models.AsyncModels, inps, tool, model=None, **kwargs):
    if model is not None: kwargs['model'] = model  # forward an explicit model override
    _ = await self(inps, tools=[tool], use_afc=False, tool_mode="ANY", temp=0., stream=False, **kwargs)
    return [nested_idx(ct, "function_response", "response", "result") for ct in nested_idx(self, "result_content", -1, "parts") or []]

@patch
def structured(self: genai.Client | genai.client.AsyncClient, inps, tool, model=None, **kwargs):
    return self.models.structured(inps, tool, model, **kwargs)

AsyncModels.structured

 AsyncModels.structured (inps, tool, model=None, **kwargs)

Models.structured

 Models.structured (inps, tool, model=None, **kwargs)

Defining the structured interface is essentially a matter of setting tool_mode to ANY (so that the LLM is forced to use the passed tool) and returning the function result.

cli = Client(model=models[0])

def addition(
    a:int, # the 1st number to add
    b=0,   # the 2nd number to add
)->int:    # the result of adding `a` to `b`
    "Sums two numbers."
    return a+b


a,b = 694599,645893212
pr = f"What is {a}+{b}?"
cli.structured(pr, addition)
[646587811]
class President:
    "Information about a president of the United States"
    def __init__(self, 
                first:str, # President first name.
                last:str, # President last name.
                spouse:str, # President's spouse name. 
                years_in_office:str, # President years in office, formatted as: {start_year}-{end_year}
                birth_year:int=0 # President year of birth (`0` if unknown).
        ):
        assert re.match(r'\d{4}-\d{4}', years_in_office), "Invalid format: `years_in_office`: should be : '{start_year}-{end_year}'"
        store_attr()

    __repr__ = basic_repr('first, last, spouse, years_in_office, birth_year')
cli.structured("Key details about the first 10 Presidents of the United States", President)
[President(first='George', last='Washington', spouse='Martha Dandridge Custis', years_in_office='1789-1797', birth_year=1732),
 President(first='John', last='Adams', spouse='Abigail Smith', years_in_office='1797-1801', birth_year=1735),
 President(first='Thomas', last='Jefferson', spouse='Martha Wayles Skelton', years_in_office='1801-1809', birth_year=1743),
 President(first='James', last='Madison', spouse='Dolley Payne Todd', years_in_office='1809-1817', birth_year=1751),
 President(first='James', last='Monroe', spouse='Elizabeth Kortright', years_in_office='1817-1825', birth_year=1758),
 President(first='John Quincy', last='Adams', spouse='Louisa Catherine Johnson', years_in_office='1825-1829', birth_year=1767),
 President(first='Andrew', last='Jackson', spouse='Rachel Donelson Robards', years_in_office='1829-1837', birth_year=1767),
 President(first='Martin', last='Van Buren', spouse='Hannah Hoes', years_in_office='1837-1841', birth_year=1782),
 President(first='William Henry', last='Harrison', spouse='Anna Tuthill Symmes', years_in_office='1841-1841', birth_year=1773),
 President(first='John', last='Tyler', spouse='Letitia Christian Tyler', years_in_office='1841-1845', birth_year=1790)]

TODO: Right now we are not using the structured output capabilities of the GenAI API, which allow passing Pydantic models or JSON schemas to structure the output of the response.
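
For reference, here is a minimal sketch of what the SDK's native structured output looks like (plain google-genai usage, not wired into the interface above; the Prez schema is just an illustration):

from pydantic import BaseModel

class Prez(BaseModel):
    first: str
    last: str

r = c.models.generate_content(
    model=models[0],
    contents="Who was the first US president?",
    config={"response_mime_type": "application/json", "response_schema": Prez})
print(r.parsed)  # a Prez instance parsed from the model's JSON output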

Image generation

Although it’s not particularly complex, we can also add a simple interface to use the latest Imagen model for image generation.



Exported source
@patch
def imagen(self: genai.models.Models | genai.models.AsyncModels,
           prompt:str, # Prompt for the image to be generated
           n_img:int=1): # Number of images to be generated (1-8)
    """Generate one or more images using the latest Imagen model."""
    return self.generate_images(
                model = [m for m in imagen_models if 'imagen' in m][0],
                prompt=prompt, config={"number_of_images": n_img})

@patch
@delegates(genai.models.Models.imagen)
def imagen(self: genai.Client | genai.client.AsyncClient, prompt, **kwargs):
    return self.models.imagen(prompt, **kwargs)


imr = cli.imagen("A photorealistic image of a brightly colored Mallard duck. \
The duck is standing on a grassy bank next to a pond. The water is calm and reflects the sky. \
The duck has a vibrant green head, a yellow bill, and a brown chest. Its tail feathers are slightly raised. \
The duck looks alert and curious, as if it's just noticed something. The overall lighting is soft and natural, \
creating a peaceful and serene atmosphere. The style should be realistic, like a professional nature photograph.")

imr
  • generated_images:
    generated_images[0]
    • image:
      • image_bytes: b’89PNG1a…’
      • mime_type: image/png
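
To persist a result, the raw bytes can be written straight to disk (a small usage sketch; duck.png is an arbitrary filename):

img = imr.generated_images[0].image
with open("duck.png", "wb") as f: f.write(img.image_bytes)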

Chat interface

Creating the Chat interface is relatively easy, since the GenAI SDK already has a similar object and methods that take care of storing the message history. We only need to add properties for easy access to the history and the client, plus a callback that records the message into the history as soon as generation is finished.


source

Chat

 Chat (model:str, sp:str='', temp:float=0.6, text_only:bool=False,
       cli:google.genai.client.Client|None=None)

An extension of google.genai.chats.Chat with a series of quality of life improvements

|           | Type  | Default | Details |
|-----------|-------|---------|---------|
| model     | str   |         | The model to be used |
| sp        | str   | ''      | System prompt |
| temp      | float | 0.6     | Default temperature |
| text_only | bool  | False   | Suppress multimodality even if the model allows for it |
| cli       | google.genai.client.Client \| None | None | Optional Client to be passed (to keep track of usage) |
Exported source
valid_func = genai.chats._validate_response

@patch(as_prop=True)
def c(self: genai.chats.Chat | genai.chats.AsyncChat): return self._modules

@patch(as_prop=True)
def h(self: genai.chats.Chat | genai.chats.AsyncChat): return self._curated_history

@patch(as_prop=True)
def full_h(self: genai.chats.Chat | genai.chats.AsyncChat): return self._comprehensive_history

@patch
def _rec_res(self: genai.chats.Chat | genai.chats.AsyncChat, resp):
    if not getattr(self, "user_query", False): return
    resp_c = nested_idx(resp, "candidates", 0, "content")
    self.record_history(
        user_input=self.user_query,
        model_output=[resp_c] if resp_c else [],
        automatic_function_calling_history=resp.automatic_function_calling_history,
        is_valid=valid_func(resp)
    )

def Chat(model:str, # The model to be used 
           sp:str='', # System prompt
           temp:float=0.6, # Default temperature
           text_only:bool=False, # Suppress multimodality even if the model allows for it
           cli:genai.Client|None=None, # Optional Client to be passed (to keep track of usage)
          ): 
    """An extension of `google.genai.chats.Chat` with a series of quality of life improvements"""        
    c = Client(model, sp, temp, text_only) if cli is None else cli
    # store the defaults on the Models instance, where the patched `__call__` reads them
    c.models.model, c.models.sp, c.models.temp, c.models.text_only = model, sp, temp, text_only
    chat = c.chats.create(model=c.models.model)
    chat.c.post_cbs.insert(0, chat._rec_res)
    return chat
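
Since a Chat can reuse an existing client, several chats can share a single usage and cost tracker. A small sketch (shared is just an illustrative name):

shared = Client(model=models[0])
chat_a = Chat(models[0], cli=shared)
chat_b = Chat(models[0], cli=shared)
# both conversations now accumulate into shared.models.use and shared.models.cost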


The __call__ method only needs to make sure that tool calling can be handled easily: when called with no inputs, it re-sends the last result content (e.g. a tool response) back to the model so the conversation can continue.



Exported source
@patch
@delegates(genai.Client.__call__, keep=True)
def __call__(self: genai.chats.Chat | genai.chats.AsyncChat, inps=None, **kwargs):
    self.user_query = mk_content(inps) if inps else self.c.result_content[-1]
    return self.c(self.h + [self.user_query], **kwargs)
chat = Chat(models[0])
chat("Hi, I am Miko", stream=False)
Hi Miko! It’s nice to meet you. How can I help you today?
  • candidates:
    candidates[0]
    • content:
      • role: model
      • parts:
        parts[0]
        • text: Hi Miko! It’s nice to meet you. How can I help you today?
    • finish_reason: FinishReason.STOP
    • avg_logprobs: -0.006754375602069654
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 5; Out: 19; Total: 24
  • model_version: gemini-2.0-flash
chat("What is my name again?")
Your name is Miko.
  • candidates:
    candidates[0]
    • content:
      • role: model
      • parts:
        parts[0]
        • text: Your name is Miko.
    • finish_reason: FinishReason.STOP
    • avg_logprobs: -0.002763839748998483
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 30; Out: 6; Total: 36
  • model_version: gemini-2.0-flash
for chunk in chat("Write a short poem with my name in it", stream=True): print(chunk, end='')
The sun dips low, a fiery show,
While shadows lengthen, soft and slow.
A gentle breeze begins to blow,
And whispers secrets, soft and low.

**Miko**, bright, a shining grace,
A smile that lights up any place.
A gentle heart, a kind embrace,
Leaving beauty in your trace.
chat("Generate an image of a cute Bigfoot cub.", model=imagen_models[0])
  • candidates:
    candidates[0]
    • content:
      • role: model
      • parts:
        parts[0]
        • inline_data:
          • data: b’89PNG1a…’
          • mime_type: image/png
    • finish_reason: FinishReason.STOP
    • index: 0
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 136; Out: 0; Total: 136
  • model_version: gemini-2.0-flash-exp
chat("Add a party hat.", model=imagen_models[0])
  • candidates:
    candidates[0]
    • content:
      • role: model
      • parts:
        parts[0]
        • inline_data:
          • data: b’89PNG1a…’
          • mime_type: image/png
    • finish_reason: FinishReason.STOP
    • index: 0
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 401; Out: 0; Total: 401
  • model_version: gemini-2.0-flash-exp
a,b = 123,645893212
pr = f"What is {a}+{b}?"


chat(pr, tools=[addition], use_afc=False)
  • addition(a=123, b=645893212)
  • candidates:
    candidates[0]
    • content:
      • role: model
      • parts:
        parts[0]
        • function_call:
          • name: addition
          • args:
            • a: 123
            • b: 645893212
    • finish_reason: FinishReason.STOP
    • avg_logprobs: -2.6425186661072075e-06
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 727; Out: 3; Total: 730
  • model_version: gemini-2.0-flash
chat()
123 + 645893212 = 645893335
  • candidates:
    candidates[0]
    • content:
      • role: model
      • parts:
        parts[0]
        • text: 123 + 645893212 = 645893335
    • finish_reason: FinishReason.STOP
    • avg_logprobs: -0.00010100648236962465
  • automatic_function_calling_history:
  • usage_metadata: Cached: 0; In: 670; Out: 26; Total: 696
  • model_version: gemini-2.0-flash


Exported source
@patch(as_prop=True)
def use(self: genai.chats.Chat | genai.chats.AsyncChat): return self.c.use

@patch(as_prop=True)
def cost(self: genai.chats.Chat | genai.chats.AsyncChat): return self.c.cost

@patch
def _repr_markdown_(self: genai.chats.Chat | genai.chats.AsyncChat):
    if not hasattr(self.c, 'result'): return 'No results yet'
    last_msg = self.c.result._repr_markdown_().split('<details>')[0]

    def content_repr(ct):
        r = ct.role
        cts = {'text': '', 'images': []}
        for part in ct.parts:
            if part.text is not None: cts['text'] += part.text
            if part.inline_data is not None:
                cts['images'].append(types.Image(image_bytes=part.inline_data.data, mime_type=part.inline_data.mime_type))
        return f"_**{r}**_: {cts['text']}" + f"{' '.join(['IMAGE_' + str(i) for i, _ in enumerate(cts['images'])])}"
    
    history = "\n\n".join([content_repr(ct) for ct in self.h])
    det = self.c._repr_markdown_().split('\n\n')[-1]
    return f"""{last_msg}

<details>
<summary>History</summary>

{history}
</details>

{det}"""


chat

123 + 645893212 = 645893335

History

user: Hi, I am Miko

model: Hi Miko! It’s nice to meet you. How can I help you today?

user: What is my name again?

model: Your name is Miko.

user: Write a short poem with my name in it

model: The sun dips low, a fiery show, While shadows lengthen, soft and slow. A gentle breeze begins to blow, And whispers secrets, soft and low.

Miko, bright, a shining grace, A smile that lights up any place. A gentle heart, a kind embrace, Leaving beauty in your trace.

user: Generate an image of a cute Bigfoot cub.

model: IMAGE_0

user: Add a party hat.

model: IMAGE_0

user: What is 123+645893212?

model:

tool:

model: 123 + 645893212 = 645893335

|            | Input | Output | Cached |
|------------|------:|-------:|-------:|
| Tokens     | 2,014 | 126    | 0      |
| **Totals** | **Tokens: 2,140** | **$0.000198** |  |