Gemini is the latest and most advanced AI model developed by Google. It generates high-quality, accurate responses to users' queries. What makes Gemini stand out is that it can understand and work with different types of data, including text, images, code, video, and audio.
In this article, we take a look at the Gemini API and learn how to set it up. The step-by-step guide below shows how to access and use the Gemini API for free. So, let's begin.
Introducing Gemini AI Models
Gemini is the latest AI model from Google, developed through a collaboration between Google Research and Google DeepMind. It is built to be multimodal, which means it can understand and work with several types of data, such as text, images, code, audio, and video. Gemini is considered the largest and most advanced AI model developed by Google so far. It is available in three sizes:
- Gemini Ultra: The largest and most capable model, intended for highly complex tasks.
- Gemini Pro: A strong general-purpose model for a wide range of text and image reasoning tasks.
- Gemini Nano: A model designed for on-device experiences, which enables offline use cases. It uses the device's own processing power, so it runs at no cost.
How to Access and Use Gemini API for Free
You can use the Gemini API for free by generating an API key and running the code in Deepnote. The steps below walk through the setup and the main features:
Setting up Gemini API
- To set up the Gemini API, first generate an API key by visiting https://ai.google.dev/tutorials/setup
- On that page, click “Get an API Key”.
- On the next page, click “Create an API key in a new project”.
- Copy the generated API key and set it as an environment variable. This tutorial uses Deepnote, which makes it easy to store the key under the name `GEMINI_API_KEY`: go to Integrations, scroll down, and select Environment variables.
Once this is done, install the Python package using pip:
pip install -q -U google-generativeai
Next, read the API key from the environment and pass it to Google's GenAI library to configure the client.
import google.generativeai as genai
import os
gemini_api_key = os.environ["GEMINI_API_KEY"]
genai.configure(api_key=gemini_api_key)
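If the environment variable is missing, `os.environ["GEMINI_API_KEY"]` fails with a bare `KeyError`. The following is a minimal sketch of a more descriptive check; the helper name `load_api_key` is hypothetical and not part of the Gemini library, and the demonstration uses a stand-in dictionary rather than a real key.

```python
import os


def load_api_key(env=None, name="GEMINI_API_KEY"):
    """Return the API key stored under `name`, or raise a clear error.

    `load_api_key` is a hypothetical helper for this tutorial,
    not a function provided by google.generativeai.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; add it as an environment variable first."
        )
    return key


# Demonstration with a stand-in mapping instead of the real environment.
print(load_api_key({"GEMINI_API_KEY": "example-key"}))
```

In a notebook, the returned value could then be passed to `genai.configure(api_key=...)` in place of the direct `os.environ` lookup.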
Using Gemini Pro
Once the API key is configured, generating content with the Gemini Pro model is simple. Pass a prompt to the `generate_content` function and display the output as Markdown.
from IPython.display import Markdown
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content("Who is the GOAT in the NBA?")
Markdown(response.text)
Gemini can also return several alternative responses to a single prompt, known as “candidates”, so you can choose the most suitable one. They are available on the response object:
response.candidates
You can now ask the model to write a simple game in Python using the prompt below:
response = model.generate_content("Build a simple game in Python")
Markdown(response.text)
The model returns the code right away. Many other large language models begin by explaining the code instead of writing it.
Configuring the Response
You can customize the response with the `generation_config` argument. In the example below, we limit the candidate count to 1, add the word “space” as a stop sequence, and set the maximum number of output tokens and the temperature.
response = model.generate_content(
    'Write a short story about aliens.',
    generation_config=genai.types.GenerationConfig(
        candidate_count=1,
        stop_sequences=['space'],
        max_output_tokens=200,
        temperature=0.7)
)
Markdown(response.text)
The generated story now stops just before the word “space.”
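To make the effect of a stop sequence concrete, here is a purely local illustration that needs no API call. The function `apply_stop_sequences` is a hypothetical helper that mimics the observable result (text cut just before the first stop word); it is not how the API implements the feature.

```python
def apply_stop_sequences(text, stop_sequences):
    """Cut `text` just before the earliest occurrence of any stop sequence.

    Hypothetical local helper that only imitates the visible effect
    of the `stop_sequences` setting.
    """
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # ignore empty stop strings
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


story = "The aliens drifted through space for years."
print(apply_stop_sequences(story, ["space"]))
```

The printed text ends right before “space”, which matches the behavior described above.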
Streaming Response
To stream a response, set the `stream` argument to `True`. This works much like streaming in the OpenAI and Anthropic APIs, though it seemed faster.
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content("Write a Julia function for cleaning the data.", stream=True)
for chunk in response:
    print(chunk.text)
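When streaming, it is often useful to show each chunk as it arrives and still keep the complete text at the end. The sketch below assumes only that each chunk exposes a `.text` attribute, as in the loop above; `collect_stream` is a hypothetical helper, and the chunks here are stand-ins so the example runs without an API key.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Print each chunk as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        print(chunk.text, end="", flush=True)
        parts.append(chunk.text)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Stand-in chunks; with the real API, pass the streaming response instead.
fake_response = [
    SimpleNamespace(text="function clean(df)\n"),
    SimpleNamespace(text="end"),
]
full_text = collect_stream(fake_response)
```

Printing with `end=""` avoids the extra line break that a plain `print(chunk.text)` inserts between chunks.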
Using Gemini Pro Vision
Here, we load a photo by Masood Aslam to test the multimodal capabilities of Gemini Pro Vision.
First, open the image with `PIL` and display it.
import PIL.Image
img = PIL.Image.open('images/photo-1.jpg')
img
Here we have a high-resolution image of Rua Augusta Arch.
After this, load the Gemini Pro Vision model and provide it with the photo.
model = genai.GenerativeModel('gemini-pro-vision')
response = model.generate_content(img)
Markdown(response.text)
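The vision model can also take a text instruction together with the image by passing both in a list to `generate_content`. The sketch below shows that pattern behind a small hypothetical helper, `describe_image`. To keep it runnable without an API key, it uses a made-up `FakeVisionModel` stand-in; with the real library, you would pass the `gemini-pro-vision` model and the `PIL` image instead.

```python
from types import SimpleNamespace


def describe_image(model, image, prompt):
    """Send a text prompt together with an image and return the reply text."""
    response = model.generate_content([prompt, image])
    return response.text


class FakeVisionModel:
    """Hypothetical stand-in so the sketch runs without an API key."""

    def generate_content(self, parts):
        prompt, _image = parts
        return SimpleNamespace(text=f"(reply to: {prompt})")


# object() stands in for a PIL image in this offline demonstration.
print(describe_image(FakeVisionModel(), object(), "Which landmark is this?"))
```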
Chat Conversations Session
Another useful feature is the chat session, which lets you hold a back-and-forth conversation with the model. In a chat session, the model remembers the context and the responses from earlier turns.
Here, we start a chat session and ask the model to help us get started with the game Dota 2.
model = genai.GenerativeModel('gemini-pro')
chat = model.start_chat(history=[])
chat.send_message("Can you please guide me on how to start playing Dota 2?")
chat.history
As you can see, the `chat` object stores the history of both the user messages and the model replies.
If you prefer, you can also display the history as Markdown, as shown below:
from IPython.display import display
for message in chat.history:
    display(Markdown(f'**{message.role}**: {message.parts[0].text}'))
Next, you can ask the model a follow-up question:
chat.send_message("Which Dota 2 heroes should I start with?")
for message in chat.history:
    display(Markdown(f'**{message.role}**: {message.parts[0].text}'))
You can then scroll through the output to read the entire chat session with the model.
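Since the same display loop is repeated after every message, it can be wrapped in a small function. The following is a minimal sketch: `format_history` is a hypothetical helper, and it assumes only what the loops above already rely on, namely that each message has a `role` and a `parts` list whose first item has `.text`. The history here is a stand-in so the example runs offline.

```python
from types import SimpleNamespace


def format_history(history):
    """Return one Markdown line per message: bold role, then the text."""
    return [f"**{m.role}**: {m.parts[0].text}" for m in history]


# Stand-in history; with the real API, pass chat.history instead.
fake_history = [
    SimpleNamespace(
        role="user",
        parts=[SimpleNamespace(text="How do I start playing Dota 2?")],
    ),
    SimpleNamespace(
        role="model",
        parts=[SimpleNamespace(text="Begin with the in-game tutorial.")],
    ),
]
for line in format_history(fake_history):
    print(line)
```

In a notebook, each returned line could be passed to `Markdown` for rendering.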
Using Embeddings
Embedding models have become increasingly popular for context-aware applications. The Gemini embedding-001 model represents words, sentences, or entire documents as dense vectors that encode semantic meaning.
This vector representation makes it easy to measure how similar two pieces of text are by comparing their embedding vectors.
To create an embedding, pass the content to `embed_content`, which converts the text into a vector.
output = genai.embed_content(
    model="models/embedding-001",
    content="Can you direct me on how to begin accessing Dota 2?",
    task_type="retrieval_document",
    title="Embedding of Dota 2 question")
print(output['embedding'][0:10])
[0.060604308, -0.023885584, -0.007826327, -0.070592545, 0.021225851, 0.0432290
You can also convert several chunks of text into embeddings at once by passing a list of strings to the `content` argument.
output = genai.embed_content(
    model="models/embedding-001",
    content=[
        "Can you direct me on how to begin accessing Dota 2?",
        "Which Dota 2 heroes should I start with?",
    ],
    task_type="retrieval_document",
    title="Embedding of Dota 2 question")
for emb in output['embedding']:
    print(emb[:10])
[0.060604308, -0.023885584, -0.007826327, -0.070592545, 0.021225851, 0.043229062, 0.06876691, 0.049298503, 0.039964676, 0.08291664]
[0.04775657, -0.044990525, -0.014886052, -0.08473655, 0.04060122, 0.035374347,
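One common way to compare two embedding vectors is cosine similarity. The sketch below is a plain-Python illustration; the function name and the short toy vectors are made up for the example and are not real Gemini embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors for illustration only.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

Values close to 1 indicate vectors pointing in nearly the same direction, while values near 0 indicate unrelated directions. With real embeddings, the two lists would come from `output['embedding']`.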
Conclusion
Gemini, Google's latest AI model, stands out for its advanced features and multimodal capabilities. In this tutorial, we walked through how to access and use the Gemini API for free and generate responses with it. We also covered Gemini Pro Vision, chat sessions, and embeddings.