Position Details
Become a pivotal part of Cerebras Systems as a Full Stack LLM Engineer. Focus on optimizing the performance of large language models on an innovative wafer-scale AI architecture.
The Inference Core Model Bringup team needs an experienced engineer who excels in fast-paced environments to support large-scale ML applications. Your work will span from model architecture adjustments to runtime integration, delivering high performance and effective debugging across the entire Cerebras software stack.
Key Responsibilities:
• Lead the bring-up of ML models on Cerebras CSX systems
• Conduct performance tuning and optimization across the AI toolchain
• Identify and debug issues in model code, IRs, and hardware utilization
• Propose enhancements for better automation in model deployments
Requirements:
• Bachelor's, Master's, or PhD in Computer Science or related discipline
• Strong familiarity with deep learning frameworks such as TensorFlow
• Proven skills in per...