Introduction

If you work with TensorFlow and want to share a GPU across multiple processes, you have probably run into one of the situations below. This post discusses how to address them and how to get the most out of the available GPU resources.

Technical Details:

  • GPU: Nvidia RTX 3080
  • CPU & Memory: Intel i7, 32 GB RAM
  • TensorFlow Version: 2.4
  • CUDA Version: 11.2
  • Application: the TensorFlow application discussed here is Automated Number Plate Recognition, built with Darknet as the backbone network

Let’s see Memory Allocation For a TensorFlow-Based Model…


If you work with TensorFlow and want to share a GPU across multiple processes, you have probably run into one of these situations:

Let’s look at memory allocation for a TensorFlow-based model on a GPU:

These are the GPU memory details before loading any TensorFlow-based workload.

The GPU has 10 GB of memory, of which only 489 MB is occupied.
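A minimal sketch of how such a baseline reading could be taken programmatically, assuming `nvidia-smi` is on the PATH. The helper names `parse_memory_line` and `query_gpu_memory` are hypothetical, and the sample line is illustrative only; it mirrors the approximate figures quoted above rather than real captured output.

```python
import subprocess

# Query flags supported by nvidia-smi; values are reported in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_line(line):
    """Parse one 'used, total' CSV line (MiB) into a small dict."""
    used, total = (int(part.strip()) for part in line.split(","))
    return {"used_mib": used, "total_mib": total, "free_mib": total - used}


def query_gpu_memory():
    """Run nvidia-smi and return one dict per GPU (needs an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return [parse_memory_line(l) for l in result.stdout.splitlines() if l.strip()]


# Hypothetical sample line resembling the idle state described above.
sample = parse_memory_line("489, 10240")
print(sample)
```

Calling `query_gpu_memory()` before and after loading a model would show how much memory the TensorFlow process has claimed.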


Einstein Summation Convention on Operands

What is the Einstein Summation Convention?

Einstein summation is a convention for simplifying expressions that involve summation over vectors, matrices or, in general, tensors. Remember that a scalar is a rank-zero tensor, a vector is a rank-one tensor and a matrix is a rank-two tensor. In other words, scalars, vectors and matrices are all tensors that differ only in rank.

There are three rules to follow when writing an expression in Einstein summation form:

  1. Values along the repeated indices (axes) are multiplied and then implicitly summed over (if an index (axis) is repeated in the expression, it is implicitly summed over, i.e. …
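As an illustration of the repeated-index rule, here is a small pure-Python sketch of matrix multiplication, C_ik = A_ij B_jk, where the repeated index j is multiplied along and summed over. The function name `matmul_einsum` is hypothetical; with NumPy the same contraction would be written as `np.einsum('ij,jk->ik', A, B)`.

```python
def matmul_einsum(a, b):
    """Compute C[i][k] = sum over j of a[i][j] * b[j][k].

    The index j appears in both operands, so it is the one summed over;
    i and k appear only once each and survive in the output.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][j] * b[j][k] for j in range(inner)) for k in range(cols)]
        for i in range(rows)
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = matmul_einsum(a, b)
print(c)  # [[19, 22], [43, 50]]
```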


Typical Situation

Introduction

Containerization is a buzzword. Everyone talks about Docker and containerization, and everyone wants their projects containerized because of the associated benefits. The challenge is that very few people actually understand what containers are and how they can be used in Artificial Intelligence projects.

I have attended many sessions about Docker, but I couldn’t understand much, and more importantly I couldn’t see how it would help me in my role as a Data Scientist.

The objective of this blog is to cover what you need to know about containers: what they are, why they are useful and how to use them as…


Figure 1

Problem

We all encounter the above situation when the inference (prediction) time of our Machine Learning model is high. This happens especially with complex ensemble models such as Random Forest and Gradient Boosting. Prediction time also increases with the number of features, so larger models have higher inference time.

The overall result is poor response time for the API serving the model, or long-running batch cycles. We try to improve performance with techniques such as scaling up the server, using a load balancer or running multiple models in parallel.

In the worst-case scenario, we look at training a new model that…


Problem Introduction

Data Exploration is the first and most fundamental task that Data Scientists perform as soon as they receive the data.

Even basic Data Exploration often takes a lot of time. Although many of the metrics that Data Scientists want to look at are common across tasks, they usually don’t have a single code base for computing them. Most of the time, if not every time, they need to rewrite the code, fix errors and so on, which costs a lot of time.

There are various reasons for doing data exploration:

  1. Understanding distribution…


We often want to experiment with Spark but get stuck because no Spark environment is available. In this post we will discuss how to set up a Spark environment inside Google Colab with a few lines of code, so that we can start using Spark there within a few minutes.

Below are the steps for installing Spark inside Google Colab:

  1. Java is a prerequisite for Spark. We need to have Java installed before setting up Spark in Colab.

!apt-get install openjdk-8-jdk-headless

  2. A message is displayed once Java is installed. We can then check the Java version.

!java -version


Many of us are unaware of the relationship between Cosine Similarity and Euclidean Distance. Knowing this relationship is extremely helpful when we need to use one in place of the other indirectly. One application is converting the K-Means clustering algorithm into Spherical K-Means, where cosine similarity is used as the measure for clustering data.

Use Case:

We often want to cluster text documents to discover certain patterns. K-Means is a natural first choice for clustering use cases. The K-Means implementation in scikit-learn uses “Euclidean Distance” to cluster similar data points.
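A small numerical check may help make this concrete. The sketch below assumes the relationship in question is the standard identity for L2-normalized vectors, squared Euclidean distance = 2 × (1 − cosine similarity); this is an assumption about what the post goes on to derive, and the helper names are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity(u, v):
    """Dot product divided by the product of the two L2 norms."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def squared_euclidean(u, v):
    """Sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


# Arbitrary example vectors, normalized to unit length first.
u = normalize([1.0, 2.0, 3.0])
v = normalize([4.0, 0.0, -1.0])

lhs = squared_euclidean(u, v)
rhs = 2.0 * (1.0 - cosine_similarity(u, v))
print(abs(lhs - rhs) < 1e-12)
```

Under this identity, ranking unit-length points by Euclidean distance gives the same order as ranking them by decreasing cosine similarity, which is why normalizing the data first lets a Euclidean K-Means behave like a cosine-based one.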

It is also well known that…

Tanveer Khan

Sr. Data Scientist with strong hands-on experience in building Real World Artificial Intelligence Based Solutions using NLP, Computer Vision and Edge Devices.
