GK Question

technology medium true_false

Quantization reduces the size and inference latency of LLMs by storing model values as lower-precision numbers.

  1. True
  2. False

Answer: True

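Quantization stores model weights (and sometimes activations) in lower-precision formats such as INT8 or INT4 instead of FP16 or FP32. Relative to FP16, this shrinks the model by roughly 2x (INT8) to 4x (INT4), usually with only a small accuracy loss. Smaller weights need less memory and memory bandwidth, which enables deployment on edge devices and reduces cloud inference costs. This makes quantization a key technique for serving LLMs at scale.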
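The 2-4x figure and the "small accuracy loss" can be illustrated with a minimal sketch. It assumes symmetric per-tensor INT8 quantization (one common scheme among several) and an FP16 baseline. The function names, the sample weights, and the 7-billion-parameter model are hypothetical, chosen only for illustration.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole tensor (an assumption of this sketch).
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in q]


def weight_bytes(num_params, bits_per_weight):
    """Storage needed for the weights alone, ignoring scales and other metadata."""
    return num_params * bits_per_weight // 8


# Hypothetical weights, used only to show the round trip.
weights = [0.12, -0.5, 0.33, 1.27, -1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

# Hypothetical 7-billion-parameter model, FP16 baseline.
params = 7_000_000_000
fp16_bytes = weight_bytes(params, 16)
int8_bytes = weight_bytes(params, 8)
int4_bytes = weight_bytes(params, 4)

print("quantized:", q)
print("max round-trip error:", max_error)
print("FP16 / INT8 size ratio:", fp16_bytes / int8_bytes)
print("FP16 / INT4 size ratio:", fp16_bytes / int4_bytes)
```

The round-trip error per weight is bounded by half the scale, which is why the accuracy loss tends to be small when the weight range is narrow. Production schemes typically use per-channel or per-group scales to tighten that bound further.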

Topic Advanced AI/ML
Exam Relevance UPSC, Banking, SSC