Position Details
Step into a key role as an AI Model Compression Engineer at I Machines, Inc., where you will work on innovative compression techniques for AI models. You will optimize model performance across multiple architectures using hybrid methods.
You will be part of a dynamic software team focused on AI/ML, tasked with developing and executing comprehensive model compression pipelines. This includes applying sensitivity analysis and designing quantization strategies tailored for transformer models.
Key Responsibilities:
• Own development of the complete compression pipeline
• Conduct baseline assessments and benchmarking
• Implement layerwise sensitivity analysis frameworks
• Design and apply quantization strategies effectively
• Integrate pruning techniques and hybrid methods to optimize models
Requirements:
• Bachelor’s degree in a related engineering or science field
• Minimum 1 year of experience with TensorFlow, JAX, or a similar framework
• In-depth knowledge of quantizati...