Build a Personal Fitness & Nutrition Coach with Gemini 2.5, Firestore and Cloud Run
This workshop demonstrates how to build a Personal Fitness & Nutrition Coach using Gemini 2.5, Firestore, Cloud Run, and the Agent Development Kit. Participants will see how AI can analyze food photos, recommend workouts, track nutrition goals, and personalize fitness advice.
Prerequisites
To get the most out of this guide, you should be:
- Comfortable working with Python.
- Familiar with basic full-stack architectures that use HTTP services.
What You’ll Learn
- Design and build an AI agent using the Agent Development Kit (ADK), exploring how tools, callbacks, and planners extend what Gemini can do.
- Use Firestore as both a database and a vector store, supporting Retrieval-Augmented Generation (RAG) for smarter context and memory.
- Integrate multimodal capabilities, having your agent analyze images (like meals or exercises) and reason over them with text.
- Deploy the full web application to Cloud Run, making your AI agent scalable, accessible, and production-ready.
What You’ll Need
- Google Account
- Chrome Browser
- Google Cloud Project with billing enabled to use Cloud Run, Firestore, and the Gemini API.
This guide uses Python in its sample application. While Python knowledge is helpful, the focus is on the Agent and Cloud concepts, making it accessible to developers of all backgrounds.
1. Google Cloud Setup
This section prepares your Google Cloud environment, project, and necessary services.
1.1. Select and Configure Your Cloud Project
You must have an active Google Cloud project with billing enabled.
- In the Google Cloud Console, select or create your project.
- Ensure billing is enabled for your project.
1.2. Prepare Firestore Database
Firestore in Native mode is a NoSQL document database built for automatic scaling, high performance, and ease of application development. It can also act as a vector database, which supports the Retrieval-Augmented Generation (RAG) technique used in this lab.
- In the Cloud Console search bar, search for and click Firestore.
- Click the Create A Firestore Database button.
- Use the (default) database ID and select Standard Edition.
- For this lab’s demonstration, set the Security rules to OPEN.
⚠️ Caution: The Open security rules are NOT RECOMMENDED for production environments!
- Click the Create Database button.
1.3. Activate and Configure Cloud Shell Terminal
You’ll use Cloud Shell, a pre-configured command-line environment, for the rest of the setup.
- Click Activate Cloud Shell at the top of the Google Cloud console.
- Click Authorize if it prompts you to authorize your Google account.
- Once connected to Cloud Shell, check that you’re already authenticated with the following command:
gcloud auth list
- Run the following command in Cloud Shell to confirm that the gcloud command knows about your project.
gcloud config list project
- If your project isn’t set, use the following command:
gcloud config set project <YOUR_PROJECT_ID>
1.4. Enable required APIs
Enable all necessary services for the codelab with a single command. This may take a few minutes.
gcloud services enable aiplatform.googleapis.com \
firestore.googleapis.com \
run.googleapis.com \
cloudbuild.googleapis.com \
cloudresourcemanager.googleapis.com
On success, you will see a message like: Operation “operations/…” finished successfully.
1.5. Prepare Google Cloud Storage bucket
From the same terminal, create a GCS bucket to store uploaded files (e.g., images). We’ll place it in the us-central1 region.
gsutil mb -l us-central1 gs://fitness-images
Replace fitness-images with a unique name of your own, since bucket names must be globally unique across Google Cloud. You can verify that the bucket was created by navigating to Cloud Storage -> Buckets in the console.
1.6. Create Firestore Indexes for RAG
Since Firestore is a NoSQL database, we must create indexes to support compound queries and vector search for RAG. This step can take several minutes to complete.
- Run the following command to create an index for compound queries:
gcloud firestore indexes composite create \
--collection-group=fitness-images \
--field-config field-path=total_amount,order=ASCENDING \
--field-config field-path=transaction_time,order=ASCENDING \
--field-config field-path=__name__,order=ASCENDING \
--database="(default)"
- Next, run this command to support vector search on the embedding field:
gcloud firestore indexes composite create \
--collection-group="fitness-images" \
--query-scope=COLLECTION \
--field-config field-path="embedding",vector-config='{"dimension":"768", "flat": "{}"}' \
--database="(default)"
You can monitor the index creation status under Firestore > Indexes in the console.
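Once both indexes are ready, the embedding field can be queried directly from Python using the google-cloud-firestore client’s find_nearest method. Below is a minimal sketch; the query vector is a placeholder (in the real application it would come from an embedding model), and the meal_type field is only illustrative:

```python
from google.cloud import firestore
from google.cloud.firestore_v1.base_vector_query import DistanceMeasure
from google.cloud.firestore_v1.vector import Vector

db = firestore.Client()

# Placeholder 768-dimensional query vector; in practice this comes from
# an embedding model applied to the user's query text.
query_vector = Vector([0.0] * 768)

results = (
    db.collection("fitness-images")
    .find_nearest(
        vector_field="embedding",
        query_vector=query_vector,
        distance_measure=DistanceMeasure.EUCLIDEAN,
        limit=5,
    )
    .get()
)

for doc in results:
    # "meal_type" is an illustrative field name, not the lab's actual schema.
    print(doc.id, doc.to_dict().get("meal_type"))
```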
2. Code environment setup
This section prepares your code editor and installs the necessary project dependencies.
2.1. Set up the Working Directory
- Click the Open Editor button in the Cloud Shell toolbar to launch the editor.
- Ensure your active Google Cloud project is set in the bottom-left corner (status bar) of the editor.
- Clone the starter code repository:
git clone https://github.com/Roxie-32/fitness-coach.git
- In the editor, go to File -> Open Folder and select the newly created fitness-coach directory. This sets it as your main working directory.
2.2. Prepare Python Virtual Environment
We’ll use uv (a modern Python package manager) to streamline virtual environment and dependency management.
- Open a new terminal (Terminal -> New Terminal or Ctrl + Shift + C). Your terminal should be inside the fitness-coach directory.
- Download and install uv and Python 3.12:
curl -LsSf https://astral.sh/uv/0.6.16/install.sh | sh && \
source $HOME/.local/bin/env && \
uv python install 3.12
- Initialize the virtual environment and install dependencies defined in pyproject.toml:
uv sync --frozen
Note: uv handles virtual environment creation and activation automatically. From now on, use uv run instead of the standard python command (e.g., uv run main.py).
- To test the virtual environment, create a new file main.py and copy in the following code:
def main():
    print("Hello from fitness-coach!")

if __name__ == "__main__":
    main()
- Then, run the following command:
uv run main.py
You will see output like the following:
Hello from fitness-coach!
2.3. Setup Configuration Files
We use pydantic-settings to manage configuration via a settings.yaml file.
- Create a new file in the editor (File -> New Text File) and save it as settings.yaml.
- Paste the following configuration into the file:
GCLOUD_LOCATION: "us-central1"
GCLOUD_PROJECT_ID: "your_gcloud_project_id"
BACKEND_URL: "http://localhost:8081/chat"
STORAGE_BUCKET_NAME: "your_bucket_name"
DB_COLLECTION_NAME: "fitness-images"
Important: You must update the value of GCLOUD_PROJECT_ID with your actual project ID and STORAGE_BUCKET_NAME with the bucket name you created earlier.
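For reference, here is a minimal sketch of how pydantic-settings can load this YAML file into a typed settings object. The actual settings module in the repository may differ; this only illustrates the pattern:

```python
from pydantic_settings import (
    BaseSettings,
    PydanticBaseSettingsSource,
    SettingsConfigDict,
    YamlConfigSettingsSource,
)


class Settings(BaseSettings):
    """Typed view of settings.yaml, validated at startup."""

    model_config = SettingsConfigDict(yaml_file="settings.yaml")

    GCLOUD_LOCATION: str
    GCLOUD_PROJECT_ID: str
    BACKEND_URL: str
    STORAGE_BUCKET_NAME: str
    DB_COLLECTION_NAME: str

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> tuple[PydanticBaseSettingsSource, ...]:
        # Read values from the YAML file instead of the default sources.
        return (YamlConfigSettingsSource(settings_cls),)


SETTINGS = Settings()
```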
This completes the environment setup. You are now ready to begin building your ADK agent and services.
3. Build the Agent using Google ADK and Gemini 2.5
This agent has already been fully set up using Google ADK and Gemini 2.5 in the fitness_manager_agent directory. Before testing, it’s essential to understand the core components of the Agent Development Kit (ADK) agent you have set up, as these define its capabilities and behavior. To learn more about ADK, visit the official documentation.
The main component is the fitness_manager_agent, which orchestrates the application’s core logic and uses the following elements:
3.1. Agent definition (agent.py)
The agent.py file initializes the main agent, giving it its identity and capabilities:
- Name & Model: The agent is named fitness_coach_agent and uses the powerful gemini-2.5-flash model, suitable for multimodal analysis and tool-use.
- Instruction: It loads its persona and core rules from the task_prompt.md file, which defines it as “Coach Buff”, a specialized Personal Fitness and Nutrition Coach. Keeping the instructions in a separate markdown file keeps the agent code clean.
- BuiltInPlanner: It utilizes the ADK’s BuiltInPlanner to ensure the model has adequate “thinking budget” to strategically decide which tools to use and when.
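Put together, a minimal agent definition following this structure might look like the sketch below. This is an illustration, not the repository’s exact code; the file path, thinking configuration, and empty tools list are placeholders:

```python
from pathlib import Path

from google.adk.agents import Agent
from google.adk.planners import BuiltInPlanner
from google.genai import types

root_agent = Agent(
    name="fitness_coach_agent",
    model="gemini-2.5-flash",
    # Load the "Coach Buff" persona and rules from the separate prompt file.
    instruction=Path("fitness_manager_agent/task_prompt.md").read_text(),
    # Give the model an explicit thinking budget for tool-selection decisions.
    planner=BuiltInPlanner(
        thinking_config=types.ThinkingConfig(include_thoughts=True)
    ),
    tools=[],  # the four tools from tools.py are registered here
)
```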
3.2. Configuring Agent Tools (tools.py)
The agent is equipped with four custom tools, defined in tools.py, which enable interaction with the external environment (Firestore) to perform tasks that the LLM cannot do alone:
| Tool Name | Purpose | Data Stored/Retrieved |
|---|---|---|
| log_meal_data | Logs nutritional details (macros, calories) extracted from text or meal images into Firestore. | Meal Type, Calories, Protein, Carbs, Fat, Image ID |
| analyze_form_and_store | Logs detailed feedback and correction tips extracted from an exercise image analysis. | Exercise Name, Main Feedback, Correction Tips, Image ID |
| create_workout_plan | Generates and saves a structured workout routine based on user goals. | Goal, Duration, List of Exercises |
| get_user_logs | Retrieves past meal logs, workout plans, or form feedback based on a specified time range. | Past log entries (Meal, Workout, or Feedback) |
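To make the tool mechanics concrete, here is a hypothetical sketch of what log_meal_data could look like. In ADK, a tool is a plain Python function whose signature and docstring tell the model how to call it; the Firestore field names below are illustrative, not the repository’s actual schema:

```python
from google.cloud import firestore


def log_meal_data(
    meal_type: str,
    calories: int,
    protein: int,
    carbs: int,
    fat: int,
    image_id: str = "",
) -> dict:
    """Log a meal's nutritional details (macros, calories) into Firestore.

    The arguments mirror the fields the agent extracts from text or a meal photo.
    """
    db = firestore.Client()
    doc_ref = db.collection("fitness-images").document()
    doc_ref.set(
        {
            "type": "meal",
            "meal_type": meal_type,
            "calories": calories,
            "protein": protein,
            "carbs": carbs,
            "fat": fat,
            "image_id": image_id,
            "logged_at": firestore.SERVER_TIMESTAMP,
        }
    )
    return {"status": "success", "doc_id": doc_ref.id}
```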
3.3. Memory Management Callback (callbacks.py)
Google ADK lets us “intercept” the agent runtime at various points. You can read more about this capability in the documentation. The agent uses a special function, modify_image_data_in_history, as a before_model_callback.
- Role: This callback runs before every API call to Gemini. Its primary role is to manage the conversation history size, which can quickly grow large with many high-resolution images.
- Mechanism: It hashes the original image data to create a unique [IMAGE-ID <hash-id>] placeholder. It then prunes the actual, large image data from older user messages (keeping only the last 3 for immediate context) while ensuring the smaller placeholder text remains.
- Tool link: This placeholder allows the LLM to still refer to the image using the get_user_logs tool (as instructed in task_prompt.md) without incurring the cost or latency of sending the full image with every turn.
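A simplified sketch of this callback pattern is shown below. The real callbacks.py is more involved; this only illustrates the ADK before_model_callback signature and the prune-and-placeholder idea:

```python
import hashlib
from typing import Optional

from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmRequest, LlmResponse
from google.genai import types


def modify_image_data_in_history(
    callback_context: CallbackContext, llm_request: LlmRequest
) -> Optional[LlmResponse]:
    """Replace image bytes in older user messages with text placeholders."""
    user_turns_seen = 0
    # Walk the history from newest to oldest so the last 3 user turns keep
    # their full image data for immediate context.
    for content in reversed(llm_request.contents):
        if content.role != "user":
            continue
        user_turns_seen += 1
        if user_turns_seen <= 3:
            continue
        for i, part in enumerate(content.parts or []):
            if (
                part.inline_data
                and part.inline_data.mime_type
                and part.inline_data.mime_type.startswith("image/")
            ):
                image_id = hashlib.sha256(part.inline_data.data).hexdigest()[:8]
                content.parts[i] = types.Part(text=f"[IMAGE-ID {image_id}]")
    # Returning None tells ADK to proceed with the (modified) request.
    return None
```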
3.4 The Prompt
Designing an agent with complex interactions and capabilities requires a well-crafted prompt to guide its behavior. We already have a mechanism for handling image data in the conversation history, and we also want the agent to be able to search for and retrieve the correct image, so all of this must be conveyed in a clear prompt structure. We ask the agent to structure its output in a markdown format (shown below) so that we can parse the thinking process, the final response, and any attachment. This prompt already exists in the task_prompt.md file.
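One plausible shape of that output structure is sketched below; the authoritative version lives in task_prompt.md:

```markdown
# THINKING PROCESS
<the agent's step-by-step reasoning, shown separately in the UI>

# FINAL RESPONSE
<the user-facing answer, optionally containing attachment tags
such as [IMAGE-ID <hash-id>] when an image should be returned>
```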
You can read more about prompt engineering resources here.
4. Testing the Agent
Now that the environment is configured and the architecture is understood, you can run the ADK agent for testing and debugging using both the command-line interface (CLI) and the local development UI.
4.1. CLI interaction
Run the following command from your terminal to communicate with the agent via the CLI. Note that this interface supports text-only input.
uv run adk run fitness_manager_agent
You will see output similar to this:
Log setup complete: /tmp/agents_log/agent.20251108_032723.log
To access latest log: tail -F /tmp/agents_log/agent.latest.log
Running agent fitness_coach_agent, type exit to exit.
user:
Try asking about its capabilities: “What can you help me with?”
4.2. Launching the local development UI
The ADK also provides a powerful development UI that lets you interact with the agent, inspect logs, and debug tool usage. Run the following command to start the local server:
uv run adk web --port 8080
You will see confirmation that the server has started:
INFO: Started server process [xxxx]
INFO: Waiting for application startup.
+-----------------------------------------------------------------------------+
| ADK Web Server started |
| |
| For local testing, access at http://localhost:8080. |
+-----------------------------------------------------------------------------+
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
4.3. Accessing the web interface
Click the Web Preview button on the top area of your Cloud Shell Editor and select Preview on port 8080.
On the web page:
- Select the fitness_coach_agent from the top-left dropdown.
- Interact with the bot. The left window displays detailed logs and traces during the agent’s runtime.
4.4. Testing Tool Usage and Multimodal Analysis
To fully test the agent capabilities:
- Go to the images directory and download the jollof rice image.
- Upload the file to the bot by clicking the “clip” icon and say: “I had this for lunch today, 5th of November. Log it for me.”
When the agent uses its internal tools, you can inspect the function calls and results in the development UI’s log panel.
You now have a complete working development agent!
It’s time to complete the application by building a proper, polished UI and adding the ability to upload and download image files.
5. Building the Full Stack Application
Now we will integrate the agent into a full web application with a Gradio frontend and a FastAPI backend.
5.1. Build Frontend Service using Gradio
The frontend.py file in your project sets up the Gradio chat interface. It encodes user-uploaded images to base64 and sends them, along with the chat message and session ID, to the backend service. It then decodes and displays any image attachments returned by the agent.
- Open a new terminal session (Terminal -> New Terminal or Ctrl + Shift + C).
- Run the frontend service in this new terminal:
uv run frontend.py
- Access the web interface by clicking the Web Preview button on the top area of your Cloud Shell Editor and selecting Preview on port 8080.
Note: The application will display, but submissions will fail because the backend service is not yet running. Leave this terminal running the frontend.
Code explanation
The frontend.py file sets up the user interface using Gradio’s gr.ChatInterface. The core chat submission logic is contained within the get_response_from_llm_backend function, which manages the communication with the FastAPI backend:
- Image pre-processing: User-uploaded image files (which Gradio provides as temporary local paths) are processed by encode_image_to_base64_and_get_mime_type. This function reads the image bytes, converts them into a Base64 string, and determines the MIME type, packaging the result as an ImageData object (a sketch of this helper follows this list).
- Request construction: A ChatRequest Pydantic model is constructed. This payload includes the user_input, the list of Base64-encoded images, and the session_id (retrieved from the Gradio state to maintain chat continuity).
- Backend call: The JSON payload is sent to the backend’s /chat endpoint using the requests.post method.
- Response processing: The raw response is validated against the ChatResponse Pydantic model.
- The session ID is immediately extracted and saved back into the Gradio state.
- The agent’s internal thinking process is displayed as a separate gr.ChatMessage for developer debugging.
- Any image attachments returned by the agent (which are Base64 strings) are converted back into a PIL Image object using decode_base64_to_image and rendered using gr.Image.
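As an illustration of the image pre-processing step, the core of a helper like encode_image_to_base64_and_get_mime_type can be written with only the standard library. This is a sketch; the repository’s version also wraps the result in the ImageData model:

```python
import base64
import mimetypes


def encode_image_to_base64_and_get_mime_type(image_path: str) -> tuple[str, str]:
    """Read an image from a local path and return (base64 string, MIME type)."""
    mime_type, _ = mimetypes.guess_type(image_path)
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    # Fall back to a generic type when the extension is unrecognized.
    return encoded, mime_type or "application/octet-stream"
```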
5.2. Build Backend Service using FastAPI
The backend.py file uses FastAPI to define the /chat endpoint. It initializes the ADK Runner, InMemorySessionService, and GcsArtifactService during startup. For every request, it runs the fitness_manager_agent, handles image storage in GCS, and retrieves image attachments before sending the final response to the frontend.
- Open a third terminal session (Terminal -> New Terminal or Ctrl + Shift + C).
- Run the backend service in this third terminal:
uv run backend.py
Note: The backend service runs on port 8081.
Code explanation
The backend.py file is the ADK orchestration layer, exposing the /chat endpoint using FastAPI on port 8081.
- Service Initialization (Lifespan): The lifespan function executes during application startup, initializing all necessary ADK services only once:
- InMemorySessionService: Used to manage conversation history and state per session.
- GcsArtifactService: Configured with the GCS bucket name (SETTINGS.STORAGE_BUCKET_NAME) to handle large multimodal data (images).
- Runner: The core ADK component that orchestrates the fitness_coach_agent, linking it to the session and artifact services.
- The /chat Endpoint Flow: This endpoint handles the primary request processing:
- Artifact Storage: The function format_user_request_to_adk_content_and_store_artifacts (executed asynchronously in a separate thread via asyncio.to_thread) receives the Base64 images from the frontend. It decodes them, stores the raw image files in GCS, and generates a corresponding hash-ID placeholder for the ADK Event content.
- Agent Execution: The app_contexts.agent_runner.run(event) method is called. This triggers the agent’s full lifecycle (planner execution, tool selection, model calling, and response stream generation).
- Response Assembly & Post-Processing: The response, received as an asynchronous stream, is assembled into a full_response string. Utility functions then handle final data extraction (simplified sketches follow this list):
- extract_attachment_ids_and_sanitize_response: Parses the raw text for the special ADK image attachment tag ([IMAGE-ID <hash-id>]) that the agent includes when it decides to return an image.
- extract_thinking_process: Separates the internal # THINKING PROCESS block from the final user response.
- Attachment Retrieval: For any extracted attachment_ids, the function download_image_from_gcs retrieves the corresponding raw image bytes from GCS, Base64 encodes them, and prepares them to be sent back to the Gradio frontend in the final ChatResponse.
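The two extraction utilities mentioned above can be sketched with regular expressions. These are simplified versions that assume the output format from section 3.4; the repository’s implementations may differ:

```python
import re


def extract_attachment_ids_and_sanitize_response(text: str) -> tuple[list[str], str]:
    """Collect [IMAGE-ID <hash-id>] tags and strip them from the user-facing text."""
    attachment_ids = re.findall(r"\[IMAGE-ID\s+([^\]]+)\]", text)
    sanitized = re.sub(r"\[IMAGE-ID\s+[^\]]+\]", "", text).strip()
    return attachment_ids, sanitized


def extract_thinking_process(text: str) -> tuple[str, str]:
    """Split the '# THINKING PROCESS' block from the final user response."""
    parts = re.split(r"#\s*FINAL RESPONSE", text, maxsplit=1)
    if len(parts) == 2:
        thinking = parts[0].replace("# THINKING PROCESS", "").strip()
        return thinking, parts[1].strip()
    # No structured blocks found: treat the whole text as the final answer.
    return "", text.strip()
```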
5.3. Integration Test
You should now have multiple services running in different Cloud Shell terminal tabs:
- The frontend service runs on port 8080.
- The backend service runs on port 8081.
- Access the frontend application again by clicking the Web Preview button on the top area of your Cloud Shell Editor and selecting Preview on port 8080.
- At this point, you should be able to upload your images and chat seamlessly with the assistant from the web application.
Testing Scenarios:
- Go to the images directory and download the images.
- Upload the file to the bot by clicking the “clip” icon and try the following:
- Meal Logging (Multimodal): Upload a meal image and say: “I had this for lunch today. Log it for me.”
- Form Analysis (Multimodal): Upload an exercise image and say: “How is my squat form?”
- Retrieval (RAG): After logging, ask: “What meals did I log yesterday?” or “Show me the feedback I received on my squat.”
When the agent uses its internal tools, you can inspect the function calls and results in the Gradio chat window (if you enabled the thinking process) and the backend terminal logs.
You now have a complete, working full-stack application!
6. Deploying to Cloud Run
To make this app accessible from anywhere, we will package the application and deploy it to Cloud Run. The goal is to run both the frontend (:8080) and backend (:8081) services within a single container, so we use supervisord to manage both processes. You can inspect the supervisord.conf file and the Dockerfile to see that supervisord is set as the entrypoint; a sketch of such a configuration is shown below.
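For reference, a minimal supervisord.conf for this setup might look like the following sketch (check the actual file in the repository; program names and options here are illustrative):

```ini
[supervisord]
; Run in the foreground so the container stays alive.
nodaemon=true

[program:backend]
command=uv run backend.py
autorestart=true

[program:frontend]
command=uv run frontend.py
autorestart=true
```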
- Open a new terminal.
- Ensure the current project is configured to your active project. If not, use the command:
gcloud config set project [PROJECT_ID]
- Run the following command to deploy the application to Cloud Run:
gcloud run deploy fitness-coach-assistant \
--source . \
--port=8080 \
--allow-unauthenticated \
--env-vars-file=settings.yaml \
--memory 1024Mi \
--region us-central1
- If you are prompted to acknowledge the creation of an Artifact Registry repository for Docker in us-central1, answer Y.
- Once the deployment is complete, you will receive a public URL similar to this: https://fitness-coach-assistant-…us-central1.run.app.
You can now access your live application from any browser or mobile device!