Leaks are among the biggest problems affecting piping and cabling systems in industries such as electricity and power, buildings and smart cities, and oil and gas. Addressing leaks promptly is paramount, because failure to do so can bring the transportation chain to a complete standstill. Most AI-based leak detection systems have not reached deployment because they are prone to false positives. Leaks are rare events, and when they do occur they often go unnoticed. Because so few leak points are ever identified, it is difficult to build an AI-based model to detect them. To aid or replace rule-based and physics-based leak detection systems, this paper proposes a novel AI-based leak detection solution using reinforcement learning, which not only reduces false positives but also extends to multi-armed-bandit-based leak localization. With this methodology, we model the latent behavior of any piping or cabling system and provide a Q-learning-based shortest-path recommendation to help the maintenance team reach the leak node quickly.
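To make the idea of a Q-learning-based shortest-path recommendation concrete, the following is a minimal sketch of generic tabular Q-learning on a small weighted graph. It is not the formulation used in this paper: the network layout, node names (`depot`, `leak`), edge costs, reward design (negative edge cost per step), and hyperparameters are all hypothetical and chosen only for illustration.

```python
import random

def q_learning_shortest_path(graph, start, goal, episodes=2000,
                             alpha=0.5, gamma=1.0, epsilon=0.2,
                             max_steps=50, seed=0):
    """Tabular Q-learning on a weighted graph (illustrative sketch).

    graph: dict mapping node -> {neighbor: edge_cost}. Assumes every
    non-goal node has at least one outgoing edge and can reach the goal.
    The reward for traversing an edge is its negative cost, so maximizing
    return corresponds to minimizing total path cost when gamma = 1.
    """
    rng = random.Random(seed)
    # One Q-value per (node, neighbor) pair, initialized to zero.
    q = {u: {v: 0.0 for v in nbrs} for u, nbrs in graph.items()}
    starts = [u for u in graph if u != goal]

    for _ in range(episodes):
        state = rng.choice(starts)  # random start so every node is visited
        for _ in range(max_steps):
            if rng.random() < epsilon:
                nxt = rng.choice(list(graph[state]))   # explore
            else:
                nxt = max(q[state], key=q[state].get)  # exploit
            reward = -graph[state][nxt]
            # No future value once the goal (leak node) is reached.
            future = 0.0 if nxt == goal else max(q[nxt].values())
            q[state][nxt] += alpha * (reward + gamma * future - q[state][nxt])
            state = nxt
            if state == goal:
                break

    # Greedy rollout from the start node; the length guard avoids
    # looping forever if the table has not converged.
    path = [start]
    while path[-1] != goal and len(path) <= len(graph):
        path.append(max(q[path[-1]], key=q[path[-1]].get))
    return path

def path_cost(graph, path):
    """Sum of edge costs along a path."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))

# Hypothetical pipe network: edge costs stand in for travel time.
network = {
    "depot": {"A": 1, "B": 4},
    "A": {"depot": 1, "B": 1, "C": 5},
    "B": {"depot": 4, "A": 1, "C": 1, "leak": 5},
    "C": {"A": 5, "B": 1, "leak": 1},
    "leak": {"B": 5, "C": 1},
}

path = q_learning_shortest_path(network, "depot", "leak")
print(path, path_cost(network, path))
```

In this sketch the agent learns which neighboring node to move to from each junction, so the recommended route is read off by following the highest-valued action from the maintenance team's starting node until the leak node is reached. Setting `gamma = 1.0` is a deliberate choice here so that the learned values correspond to undiscounted path cost.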