The use of drones, or Unmanned Aerial Vehicles (UAVs), for both military and civilian purposes has increased in India over the past decade. They are used for reconnaissance, imaging, damage assessment, and payload delivery (lethal as well as utilitarian) and, more recently during the COVID-19 pandemic, for contactless delivery of medicines. As UAVs become more affordable, easier to fly, and more adaptable for crime, terrorism, or military purposes, defence forces are increasingly challenged by the need to detect and identify such aircraft quickly. A number of counter-drone solutions are being developed, but ground-based drone detection systems can be very expensive, depending on the number of sensors deployed and the sophistication of the fusion algorithms required. With recent research showing that attention-based neural networks possess better object detection and tracking capabilities than traditional convolutional neural networks, a novel attention-based encoder-decoder transformer model is developed to detect drones in a given field of view and track their projected trajectories in real time. This allows surveillance systems in red zones (such as military bases and airports) to detect and neutralize possible dangers associated with drones.