Building a Food Vision WebApp with the Gemini Flash 1.5 Model

Introduction

In the fast-changing landscape of AI, efficiency and scalability have become paramount. Developers are actively looking for models that deliver high performance at reduced cost, with lower latency and better scalability. Enter Gemini Flash 1.5, a new release that retains the strengths of earlier Gemini models and offers even better performance for many image-related tasks. Released as part of Gemini 1.5, which also includes the Gemini 1.5 Pro variant, Flash 1.5 stands out as a model built for fast, efficient, high-volume tasks. In this blog, we look at why Gemini Flash 1.5 matters and build a Food Vision WebApp with Flask.

Learning Outcomes

  • Understand the key features and performance improvements of Gemini Flash 1.5.
  • Learn how to integrate and use the Gemini Flash 1.5 model in a Flask web application.
  • Gain insight into the importance of lightweight AI models for high-volume tasks.
  • Walk through the process of creating a Food Vision WebApp using Flask and Gemini Flash 1.5.
  • Explore the steps for configuring and using Gemini Flash 1.5 through Google AI Studio.
  • Identify the benefits of using JSON schema mode for structured AI model outputs.

This article was published as a part of the Data Science Blogathon.

Gemini Flash 1.5

Need for Lightweight AI Models

As AI is integrated into more industries, there is a growing need for fast, efficient models that can process large amounts of data. Traditional AI models are resource-intensive, often high in latency, and hard to scale. This is a major challenge for developers building applications that need real-time responses or that are deployed in resource-constrained environments such as mobile devices or edge computing platforms.

Recognizing these challenges, Google introduced the Gemini Flash 1.5 model, a lightweight AI solution tailored to the needs of modern developers. Gemini Flash 1.5 is designed to be cost-efficient, fast, and scalable, making it an ideal choice for high-volume tasks where performance and cost are critical considerations.

Key Features of Gemini Flash 1.5

  • Enhanced Performance and Scalability: One of the most significant updates in Gemini Flash 1.5 is its focus on performance and scalability. Google has raised the rate limit for Gemini Flash 1.5 to 1000 requests per minute (RPM), a substantial improvement that lets developers handle larger workloads without compromising on speed. The removal of the daily request limit further improves its usability, enabling continuous processing without interruptions.
  • Tuning Support: Customization and adaptability are key parts of successful AI implementations. To support this, Google is rolling out tuning support for Gemini Flash 1.5, allowing developers to fine-tune the model to meet specific performance thresholds. Tuning is supported both in Google AI Studio and directly through the Gemini API. This feature is particularly valuable for developers who want to optimize the model for niche applications or specific data sets. Importantly, tuning jobs are free of charge, and using a tuned model does not incur additional per-token costs, making it an attractive option for cost-conscious developers.
  • JSON Schema Mode: Another notable feature in Gemini Flash 1.5 is JSON schema mode. This mode gives developers more control over the model's output by letting them specify the desired JSON schema. This matters for applications that require structured output, such as data extraction, API responses, or integration with other systems. Because its output conforms to a specified schema, Gemini Flash 1.5 can be integrated into existing workflows more easily.
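
To make JSON schema mode more concrete, here is a minimal sketch. It assumes the google-generativeai Python SDK, where GenerationConfig accepts response_mime_type and response_schema. The FoodItem fields, the prompt text, and both helper names are hypothetical and are not part of the app built later in this article.

```python
import json
from typing import List

try:  # schema conversion in the SDK tends to prefer typing_extensions on older Pythons
    from typing_extensions import TypedDict
except ImportError:
    from typing import TypedDict


class FoodItem(TypedDict):
    """Hypothetical schema for one detected food item."""
    name: str
    calories: int
    protein_g: float


def parse_food_items(raw_json: str) -> List[FoodItem]:
    """Parse the model's JSON text and check that every item has the expected keys."""
    items = json.loads(raw_json)
    required = set(FoodItem.__annotations__)
    for item in items:
        missing = required - set(item)
        if missing:
            raise ValueError(f"item is missing keys: {sorted(missing)}")
    return items


def analyze_with_schema(image_part: dict) -> List[FoodItem]:
    """Ask Gemini Flash 1.5 for JSON shaped like FoodItem (needs network access and an API key)."""
    # Imported lazily so parse_food_items can be used without the SDK installed.
    import google.generativeai as genai

    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content(
        ["List every food item in this image.", image_part],
        generation_config=genai.GenerationConfig(
            response_mime_type="application/json",
            response_schema=list[FoodItem],
        ),
    )
    return parse_food_items(response.text)
```

Keeping the parsing step separate from the API call means the structured output can be validated and unit-tested without calling the model.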

Getting Started with Flask

Flask is a lightweight micro web framework that lets developers build web applications in Python. It is called a "micro" framework because it does not require much setup or configuration, unlike frameworks such as Django. Flask is well suited to small and medium-sized web applications, prototyping, and even large-scale applications with the right architecture.

Key Features of Flask

  • Lightweight: Flask has a small codebase and few dependencies, making it easy to learn and use.
  • Flexible: Flask can be used to build a wide range of web applications, from simple web pages to complex web services.
  • Modular: Flask has a modular design, making it easy to extend and customize.
  • Unit Testing: Flask has built-in support for unit testing, making it easy to write and run tests.

Flask App Example

from flask import Flask

app = Flask(__name__)

# Serve a greeting at the site root
@app.route("/")
def hello_world():
    return "<p>Hello, World!</p>"

if __name__ == "__main__":
    app.run(debug=True)

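Output: running the script starts Flask's development server, and opening the root URL in a browser shows "Hello, World!".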

Read the Flask documentation for more details.

Food Vision WebApp: Overview of Project Organization

The Food Vision WebApp is organized into several key components: a virtual environment folder (myenv/), static files for frontend assets (static/), HTML templates (templates/), and a main application file (app.py). The .env file stores sensitive configuration details. This structure ensures a clean separation of concerns, making the project easier to manage and scale.

Folder Structure

This section outlines the folder structure of the Food Vision WebApp and shows where each component lives. Understanding this organization is important for maintaining and extending the application efficiently.

myenv/             # Folder for the virtual environment
│
static/            # Folder for static files
│   ├── scripts.js
│   └── styles.css
│
templates/         # Folder for HTML templates
│   └── index.html
│
.env               # Environment variables file
app.py             # Main application file
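
If you prefer to scaffold this layout from a script, the following is a minimal sketch using only the standard library. The file list is taken from the structure above; the helper name create_skeleton is hypothetical, and myenv/ is left out because the venv command creates it.

```python
from pathlib import Path

# Relative paths from the folder structure above (myenv/ is created by venv).
PROJECT_FILES = [
    "static/scripts.js",
    "static/styles.css",
    "templates/index.html",
    ".env",
    "app.py",
]


def create_skeleton(root: str) -> list:
    """Create empty placeholder files under root, skipping any that already exist."""
    created = []
    for relative in PROJECT_FILES:
        path = Path(root) / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(relative)
    return created
```

Calling create_skeleton with the path of your project folder returns the list of files it created, so running it a second time returns an empty list and never overwrites existing work.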

Create a Virtual Environment

Creating a virtual environment isolates your project dependencies from the global Python environment. Follow these steps to set up and activate a virtual environment for the Food Vision WebApp.

python -m venv myenv

Activate in Windows (Command Prompt)

myenv\Scripts\activate

Activate in Windows (PowerShell)

.\myenv\Scripts\Activate.ps1

Activate in macOS/Linux (Bash/Zsh)

source myenv/bin/activate
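
To confirm that activation worked, you can ask the interpreter itself. This is a small sketch that relies on the standard sys.prefix and sys.base_prefix attributes; the function name is hypothetical.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_environment())
```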

Install the Dependencies

Install the Python packages required to run the Food Vision WebApp. These dependencies include libraries for web development, image processing, and environment management.

pip install google-generativeai
pip install flask
pip install pillow
pip install python-dotenv
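
After installing, a quick check can tell you whether anything is still missing. The sketch below uses importlib from the standard library; the mapping from pip package names to import names reflects the four packages above, and the helper name missing_packages is hypothetical.

```python
from importlib.util import find_spec

# pip package name -> module name used in import statements
REQUIRED_MODULES = {
    "google-generativeai": "google.generativeai",
    "flask": "flask",
    "pillow": "PIL",
    "python-dotenv": "dotenv",
}


def missing_packages(required: dict) -> list:
    """Return the pip names of packages whose import module cannot be found."""
    missing = []
    for package, module in required.items():
        try:
            found = find_spec(module) is not None
        except ModuleNotFoundError:
            # Raised for dotted names when the parent package is absent.
            found = False
        if not found:
            missing.append(package)
    return missing


print("missing:", missing_packages(REQUIRED_MODULES))
```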

HTML Template: Designing the User Interface

The HTML template provides the structure for the Food Vision WebApp's front end. This section covers the layout, the file upload form, and the placeholders for displaying the uploaded image and results.

<!-- templates/index.html -->
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Nutrify</title>
    <link rel="stylesheet" href="{{ url_for('static', filename='styles.css') }}">
    <script src="{{ url_for('static', filename='scripts.js') }}" defer></script>
</head>
<body>
    <div class="container">
        <div class="upload-section">
            <div class="upload-form">
                <h2>Upload a file</h2>
                <p>Attach the file below</p>
                <form id="uploadForm" method="post" enctype="multipart/form-data">
                    <div class="upload-area" id="uploadArea">
                        <input type="file" id="uploadInput" name="uploadInput" accept=".jpg, .jpeg, .png" required>
                        <label for="uploadInput">Drag file(s) here to upload.<br>Alternatively, you can select a file by <a href="#" onclick="document.getElementById('uploadInput').click(); return false;">clicking here</a></label>
                    </div>
                    <div id="fileName" class="file-name"></div>
                    <button type="submit" id="submitBtn">Upload File</button>
                </form>
                <div id="loadingIndicator" style="display: none;">
                    <div class="spinner"></div>
                    <p>Loading...</p>
                </div>
            </div>
            <div id="imageDisplay" class="image-display"></div>
        </div>
        <div id="responseOutput" class="response-output"></div>
    </div>
</body>
</html>

CSS: Styling the WebApp

The CSS file defines the visual presentation of the Food Vision WebApp. It includes styles for layout, buttons, loading indicators, and responsive design to ensure a seamless user experience.

body {
    font-family: 'Roboto', sans-serif;
    background-color: #f4f4f4;
    margin: 0;
    padding: 0;
    color: #333;
    overflow-y: auto; /* Allows scrolling as needed */
    min-height: 100vh; /* Ensures at least full viewport height */
    display: flex;
    flex-direction: column; /* Adjusts direction for content flow */
}
.center-container {
    display: flex;
    align-items: center;
    justify-content: center;
    flex-grow: 1; /* Allows the container to expand */
}

.container {
    display: flex;
    flex-direction: column;
    justify-content: center;
    align-items: center;
    width: 100%;
    max-width: 100%;
    padding: 20px;
    background-color: #fff;
    box-shadow: 0 5px 15px rgba(0, 0, 0, 0.1);
    border-radius: 8px;
    flex-grow: 1;
    box-sizing: border-box; /* Keeps padding inside the 100% width */
}

.upload-section {
    display: flex;
    width: 100%;
    justify-content: space-between;
    align-items: flex-start;
    margin-bottom: 20px;
}

.upload-form {
    width: 48%;
}

.image-display {
    width: 48%;
    text-align: center;
}

h2 {
    color: #444;
    margin-bottom: 10px;
}

p {
    margin-bottom: 20px;
    color: #666;
}

/* Upload area styles */
.upload-area {
    border: 2px dashed #ccc;
    border-radius: 8px;
    padding: 20px;
    margin-bottom: 20px;
    cursor: pointer;
}

.upload-area input[type="file"] {
    display: none;
}

.upload-area label {
    display: block;
    color: #666;
    cursor: pointer;
}

.upload-area a {
    color: #007bff;
    text-decoration: none;
}

.upload-area a:hover {
    text-decoration: underline;
}

.file-name {
    margin-bottom: 20px;
    font-weight: bold;
    color: #444;
}

/* Button styles */
button {
    padding: 10px 20px;
    border: none;
    border-radius: 8px;
    cursor: pointer;
    font-size: 1em;
    transition: background-color 0.3s ease, transform 0.2s ease;
    background-color: #007bff;
    color: #fff;
}

button:hover {
    background-color: #0056b3;
    transform: translateY(-2px);
}

/* Loading indicator styles */
#loadingIndicator {
    display: none;
    text-align: center;
    margin-top: 20px;
}

.spinner {
    border: 4px solid rgba(0, 0, 0, 0.1);
    border-top: 4px solid #007bff;
    border-radius: 50%;
    width: 40px;
    height: 40px;
    animation: spin 1s linear infinite;
    margin: 0 auto;
}

@keyframes spin {
    0% { transform: rotate(0deg); }
    100% { transform: rotate(360deg); }
}

/* Image display styles */
#imageDisplay img {
    max-width: 100%;
    height: auto;
    border-radius: 8px;
    box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
}

/* Response output styles */
.response-output {
    width: 100%;
}

#responseOutput {
    text-align: left;
    margin-top: 20px;
}

#responseOutput h2 {
    color: #333;
    margin-bottom: 10px;
    font-size: 1.5em;
}

#responseOutput pre {
    white-space: pre-wrap;
    padding: 10px;
    background-color: #f9f9f9;
    border: 1px solid #ddd;
    border-radius: 8px;
    font-size: 1em;
}

Flask Application (app.py)

The app.py file powers the Food Vision WebApp by managing routes and handling image uploads. It integrates with the Gemini Flash 1.5 model to provide nutritional analysis and responses.

Step1: Setting Up Essential Libraries

This section imports the libraries and modules the Flask application needs: Flask for web development, google.generativeai for interacting with the Gemini API, and PIL for image processing.

from flask import Flask, render_template, request, redirect, url_for, jsonify
import google.generativeai as genai
from PIL import Image
import base64
import io
import os

Step2: Gemini API Configuration

Here, you configure the Gemini AI library with your API key. This setup lets the application communicate with the Gemini API to process image data and generate nutritional information.

from dotenv import load_dotenv
load_dotenv()  # read .env so GOOGLE_API_KEY is visible to os.getenv
my_api_key_gemini = os.getenv('GOOGLE_API_KEY')
genai.configure(api_key=my_api_key_gemini)

Step3: Getting the API Key

Obtain your API key from Google AI Studio. This key is required to authenticate requests to the Gemini API.

Visit Google AI Studio and get your API key.

Step4: Store Your API Key in the .env File

Save your API key in a .env file to keep it out of the source code and easy to change. The application reads the key from this file to configure the Gemini API.

GOOGLE_API_KEY="Your_API_KEY"

  • my_api_key_gemini = os.getenv('GOOGLE_API_KEY'): This retrieves your Google API key from an environment variable named GOOGLE_API_KEY.
  • genai.configure(api_key=my_api_key_gemini): This configures the Gemini AI library to use your API key when making requests.
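
Because os.getenv quietly returns None when a variable is missing, a misconfigured key only shows up later as a failed API call. The sketch below fails early with a clear message instead; require_api_key is a hypothetical helper and not part of the original app.

```python
import os


def require_api_key(name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from the environment or fail with a clear message."""
    value = os.getenv(name, "").strip()
    # Treat the placeholder from the sample .env file as "not configured".
    if not value or value == "Your_API_KEY":
        raise RuntimeError(
            f"{name} is not set; add it to .env and load it before starting the app"
        )
    return value
```

The returned value can be passed straight to genai.configure(api_key=...).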

Step5: Creating routes

In this step, you create the routes for the Flask application. These routes handle requests and responses, including rendering the homepage and processing file uploads.

app = Flask(__name__)

# Render the upload page at the site root
@app.route('/')
def index():
    return render_template('index.html')

Step6: Creating Flask Route

Create a Flask route that handles an image upload, processes the image, and sends it to Gemini Flash 1.5.

@app.route('/upload', methods=['POST'])
def upload():
    # .get() returns None when the field is missing, so the final
    # "No file uploaded" response below stays reachable
    uploaded_file = request.files.get('uploadInput')

    if uploaded_file:
        image = Image.open(uploaded_file)

        # Pick the MIME type based on the file extension
        if uploaded_file.filename.endswith('.jpg') or uploaded_file.filename.endswith('.jpeg'):
            mime_type = "image/jpeg"
        elif uploaded_file.filename.endswith('.png'):
            mime_type = "image/png"
        else:
            return jsonify(error="Unsupported file format"), 400

        # Encode the image as base64 for sending to the API and back to the browser
        buffered = io.BytesIO()
        image.save(buffered, format=image.format)
        encoded_image = base64.b64encode(buffered.getvalue()).decode('utf-8')

        image_parts = [{
            "mime_type": mime_type,
            "data": encoded_image
        }]

        input_prompt = """
            You are an expert nutritionist. Identify the food items in the image
            and calculate the total calories. Also provide the details of every
            food item with its calorie intake in the format below:

            1. Item 1 - no of calories, protein
            2. Item 2 - no of calories, protein
            ----
            ----
            Also mention the disease risk from these items.
            Finally, mention whether the food items are healthy or not and suggest
            some healthy alternatives in the format below:
            1. Item 1 - no of calories, protein
            2. Item 2 - no of calories, protein
            ----
            ----
        """

        # Send the prompt and the image to Gemini Flash 1.5
        model1 = genai.GenerativeModel('gemini-1.5-flash')
        response = model1.generate_content([input_prompt, image_parts[0]])
        result = response.text

        return jsonify(result=result, image=encoded_image)

    return jsonify(error="No file uploaded"), 400
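
The extension check in the route only matches lowercase suffixes, so a file named photo.JPG would be rejected as unsupported. One possible way to make the lookup case-insensitive is sketched below; mime_type_for and ALLOWED_MIME_TYPES are hypothetical names, and the accepted extensions mirror the accept attribute of the upload form.

```python
import os

# Extensions accepted by the upload form, mapped to their MIME types.
ALLOWED_MIME_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
}


def mime_type_for(filename: str):
    """Return the MIME type for an allowed image filename, or None otherwise."""
    extension = os.path.splitext(filename)[1].lower()
    return ALLOWED_MIME_TYPES.get(extension)
```

Inside the route, a None result would map to the existing "Unsupported file format" response.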

Step7: Running the Application

Execute the Flask app with app.run(debug=True) to start the server. This gives you a local development environment where you can test and debug the application. The complete app.py looks like this:

from flask import Flask, render_template, request, redirect, url_for, jsonify
from dotenv import load_dotenv
import google.generativeai as genai
from PIL import Image
import base64
import io
import os

# Read .env so GOOGLE_API_KEY is visible to os.getenv
load_dotenv()
my_api_key_gemini = os.getenv('GOOGLE_API_KEY')
genai.configure(api_key=my_api_key_gemini)
app = Flask(__name__)

# Render the upload page at the site root
@app.route('/')
def index():
    return render_template('index.html')

@app.route('/upload', methods=['POST'])
def upload():
    # .get() returns None when the field is missing, so the final
    # "No file uploaded" response below stays reachable
    uploaded_file = request.files.get('uploadInput')

    if uploaded_file:
        image = Image.open(uploaded_file)

        # Pick the MIME type based on the file extension
        if uploaded_file.filename.endswith('.jpg') or uploaded_file.filename.endswith('.jpeg'):
            mime_type = "image/jpeg"
        elif uploaded_file.filename.endswith('.png'):
            mime_type = "image/png"
        else:
            return jsonify(error="Unsupported file format"), 400

        # Encode the image as base64 for sending to the API and back to the browser
        buffered = io.BytesIO()
        image.save(buffered, format=image.format)
        encoded_image = base64.b64encode(buffered.getvalue()).decode('utf-8')

        image_parts = [{
            "mime_type": mime_type,
            "data": encoded_image
        }]

        input_prompt = """
            You are an expert nutritionist. Identify the food items in the image
            and calculate the total calories. Also provide the details of every
            food item with its calorie intake in the format below:

            1. Item 1 - no of calories, protein
            2. Item 2 - no of calories, protein
            ----
            ----
            Also mention the disease risk from these items.
            Finally, mention whether the food items are healthy or not and suggest
            some healthy alternatives in the format below:
            1. Item 1 - no of calories, protein
            2. Item 2 - no of calories, protein
            ----
            ----
        """

        # Send the prompt and the image to Gemini Flash 1.5
        model1 = genai.GenerativeModel('gemini-1.5-flash')
        response = model1.generate_content([input_prompt, image_parts[0]])
        result = response.text

        return jsonify(result=result, image=encoded_image)

    return jsonify(error="No file uploaded"), 400

if __name__ == "__main__":
    app.run(debug=True)

Output:

The output is a JSON response containing the nutritional analysis and health recommendations based on the uploaded food image. The analysis includes details such as calories, protein content, potential health risks, and suggestions for healthier alternatives.
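
To show how a client might consume this response, here is a minimal sketch using only the standard library. It assumes the JSON keys returned by the upload route (result and image on success, error on failure); the helper name decode_upload_response is hypothetical.

```python
import base64
import json


def decode_upload_response(body: str):
    """Split the /upload JSON body into the analysis text and the raw image bytes."""
    payload = json.loads(body)
    if "error" in payload:
        raise ValueError(payload["error"])
    # The route returns the image as a base64 string, so decode it back to bytes.
    image_bytes = base64.b64decode(payload["image"])
    return payload["result"], image_bytes
```

The same shape applies in the browser: the analysis text goes into the response area, and the base64 string can be used to display the uploaded image.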

Get the code from my GitHub repo.

Conclusion

Gemini Flash 1.5 advances the state of AI models by addressing core requirements with improved speed, efficiency, and scalability. It aims to meet the demands of today's fast-moving digital world. With strong performance, flexible tuning support, and broad coverage of text, image, and structured data tasks, Gemini Flash 1.5 lets developers build creative AI solutions that are both capable and cost-effective. Because it is lightweight and built for high-volume processing, it is a good choice for real-time mobile apps and large enterprise systems.

Key Takeaways

  • Gemini Flash 1.5 is optimized for high-volume tasks. It offers fast processing, with up to 1000 requests per minute, making it well suited to applications that require real-time responses.
  • It offers tuning support, so developers can tune the model to meet specific requirements without incurring additional cost, which makes it adaptable to a range of use cases.
  • It supports text, JSON, and images, so Gemini Flash 1.5 can handle everything from image classification to structured data output.
  • Google AI Studio provides an accessible platform for integrating and managing Gemini Flash 1.5, with features like JSON schema mode and mobile support improving the overall developer experience.
  • The removal of the daily request limit and the ability to handle a large number of requests per minute make Gemini Flash 1.5 suitable for scalable applications, from mobile apps to large-scale enterprise solutions.

Frequently Asked Questions

Q1. What is Gemini Flash 1.5?

A. Gemini Flash 1.5 is a lightweight, cost-efficient AI model developed by Google, optimized for high-volume tasks with low latency. It is part of the Gemini 1.5 release, alongside the Gemini 1.5 Pro variant.

Q2. How does Gemini Flash 1.5 differ from Gemini 1.5 Pro?

A. Gemini Flash 1.5 is designed for faster and more cost-effective processing, making it ideal for high-volume tasks. While both models share similarities, Flash 1.5 is optimized for speed and scalability in scenarios where those factors are critical.

Q3. What are the key features of Gemini Flash 1.5?

A. Key features include enhanced performance with 1000 requests per minute, tuning support for customization, JSON schema mode for structured outputs, and mobile support with light mode in Google AI Studio.

Q4. Can I fine-tune the Gemini Flash 1.5 model?

A. Yes, tuning support is available for Gemini Flash 1.5, allowing you to customize the model according to your specific needs. Tuning is currently free of charge, with no additional per-token costs.

Q5. Does Gemini Flash 1.5 support image processing?

A. Yes, Gemini Flash 1.5 supports image processing, making it suitable for tasks such as image classification and object detection, in addition to text and JSON handling.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author's discretion.
