ADaSci Banner 2024

Hands-on Guide to CodeGemma: An AI-Powered Coding Assistant by Google

Google's CodeGemma boosts developer productivity with AI-driven coding automation and intelligent code suggestions.
CodeGemma

Google has unveiled CodeGemma, an innovative suite of AI models designed to revolutionize the coding experience. Built on the advanced architecture of the Gemma family, CodeGemma aims to significantly boost developer productivity by automating complex coding tasks and providing intelligent code suggestions. Whether you’re a seasoned programmer or just starting, CodeGemma offers powerful tools to streamline your workflow, enhance code quality, and make coding more intuitive and efficient. In this article, we will understand CodeGemma and its benefits and see its implementation for code generation.

Table of Contents

  1. What is CodeGemma?
  2. Variants of CodeGemma
  3. Benefits of Using CodeGemma
  4. Implementation of CodeGemma

Deep dive into what CodeGemma is and what are the benefits of CodeGemma below. We will also be implementing CodeGemma.

What is CodeGemma?

CodeGemma is a platform that has set its own identity in the field of AI for its innovative approach to coding. The main purpose of CodeGemma is to help and empower individuals with the skills that are needed to develop in the world of fast-evolving AI. CodeGemma has quickly gained recognition for its cutting-edge curriculum and hands-on learning experiences. It consists of several AI models specifically trained to understand and generate code. These models are designed to help with tasks such as:

  1. Used for predicting and completing a half-completed code
  2. Generates a new code based on given instructions or existing code
  3. Understanding and translating human language to code
  4. Handling mathematical logic within code.

Variants of CodeGemma

CodeGemma offers three main variants for different uses:

7B Code Pre-trained Model

The most effective way to finish and generate code using given code snippets is with this model. It uses a sizable code dataset for training. It can predict what will happen next in a piece of code with good proficiency.

7B Instruction-Tuned Model

This variant is fine-tuned to understand natural language instructions. It makes it easier to interact with the AI using plain English. We can even ask it to write code or explain a code.

2B Model

This model is used for faster and more efficient code completion and generation. This model is optimized for quicker performance which makes it suitable for real-time code suggestion.

Source: Original Report

Benefits of Using CodeGemma

Enhanced Productivity

CodeGemma allows developers to focus on more complex and creative aspects of development by automating routine coding tasks.

Error Reduction

CodeGemma’s models generate code that is not only correct but also semantically meaningful. But these depend on the prompt we give. We have to make sure that our prompts make sense.

Multi-Language Support

CodeGemma supports various programming languages, including Python, JavaScript, Java, and many others. This makes it a versatile tool for developers working in different environments.

Learning Aid

This model can help beginners learn to code by providing examples and explanations, making the learning process more interactive and intuitive. 

In a world where digital literacy is becoming increasingly important, CodeGemma has become a helpful tool by helping in coding. 

Implementation of CodeGemma

As we know, CodeGemma helps generate code, complete half-completed code, or generate code based on natural language instructions. CodeGemma has its HuggingFace integration. We will use Google’s CodeGemma-2b model to complete the task we want. 

Use a Colab runtime with sufficient resources, such as a T4 GPU, to run the models efficiently.

Here, we will look into the use cases of CodeGemma by implementing it. To begin with, we will be importing the required libraries.

from transformers import GemmaTokenizer, AutoModelForCausalLM
from huggingface_hub import notebook_login

Login to the Hugging Face hub by using notebook_login. It will ask for our HuggingFace Token ID.

notebook_login()

Use the preset configurations to load the CodeGemma model into your environment and start using it for tasks like code completion and generation.

model_id = "google/CodeGemma-2b"
tokenizer = GemmaTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

Now, give a prompt that will be used to generate or complete the code. This prompt is passed through the tokenizer, and output is generated with the help of the model.

prompt = '''\
Write a code to find if 343 is even or odd\
'''
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
prompt_len = inputs["input_ids"].shape[-1]
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0][prompt_len:]))

Thus, CodeGemma suggested a code that would help us determine whether a given number is even or odd. 

We will now use CodeGemma to generate an SQL query that joins two tables and gives the final output table, as asked. 

prompt = """
Write SQL query to print the dept_Name and Std_Name from below tables.\
There are 2 tables named Department(dept_id, dept_name) and Student(std_id, std_name, dept_id)\
What query will you use to get the output?
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
prompt_len = inputs["input_ids"].shape[-1]
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0][prompt_len:]))

The output will be something like this:

Thus, using CodeGemma, we could generate code and an SQL query. However, it is important to remember that the output depends on the prompt and the model used. The prompt has to be very clear, and we might have to rerun it multiple times after changing it. 

Conclusion

CodeGemma is a significant step forward in AI-driven code assistance. By providing advanced models that can understand and generate code, Google is opening up new possibilities for developers to enhance their workflows and improve their coding efficiency. Whether you are a seasoned developer or a beginner, CodeGemma offers tools that can help us code smarter and faster.

References

  1. Link to the above code
  2. CodeGemma: HuggingFace Documentation
  3. CodeGemma: Google Original Document

Enroll in the following course to understand more about the GenAI applications with Google

Picture of Shreepradha Hegde

Shreepradha Hegde

Shreepradha is an accomplished Associate Lead Consultant at AIM, showcasing expertise in AI and data science, specifically Generative AI. With a wealth of experience, she has consistently demonstrated exceptional skills in leveraging advanced technologies to drive innovation and insightful solutions. Shreepradha's dedication and strategic mindset have made her a valuable asset in the ever-evolving landscape of artificial intelligence and data science.

The Chartered Data Scientist Designation

Achieve the highest distinction in the data science profession.

Elevate Your Team's AI Skills with our Proven Training Programs

Strengthen Critical AI Skills with Trusted Generative AI Training by Association of Data Scientists.

Our Accreditations

Get global recognition for AI skills

Chartered Data Scientist (CDS™)

The highest distinction in the data science profession. Not just earn a charter, but use it as a designation.

Certified Data Scientist - Associate Level

Global recognition of data science skills at the beginner level.

Certified Generative AI Engineer

An upskilling-linked certification initiative designed to recognize talent in generative AI and large language models

Join thousands of members and receive all benefits.

Become Our Member

We offer both Individual & Institutional Membership.