[Remote] AI Inference Engineer (f/m/d)
Note: This is a remote position open to candidates in the USA. The company is a pioneering AI company that builds real-world systems and solutions. They are seeking an AI Inference Engineer with strong C++ expertise to deploy and optimize AI models for production, with a focus on integrating machine learning capabilities into existing products and enhancing them.
Responsibilities
- Deploy machine learning models to edge devices using the llama.cpp and ggml frameworks
- Collaborate closely with researchers to assist in coding, training and transitioning models from research to production environments
- Integrate AI features into existing products, enriching them with the latest advancements in machine learning
Skills
- 4+ years of professional experience in Modern C++ (C++17/20)
- Strong knowledge of memory management, multithreading, profiling and performance optimization
- Experience debugging low-level issues (memory leaks, fragmentation, OOM, concurrency)
- Experience working with Linux development environments
- Experience integrating machine learning models into production applications
- Experience deploying and optimizing AI inference pipelines
- Hands-on experience with AI inference frameworks such as: llama.cpp (strong plus), ggml (strong plus), ONNX Runtime, TensorRT / TensorRT-LLM, OpenVINO, MLC LLM, ExecuTorch, TVM
- Experience profiling inference performance and optimizing memory usage and latency
- Strong understanding of modern AI model architectures, including: Transformer architecture, Large Language Models (LLMs), Diffusion Models, Tokenization, Attention mechanisms, KV Cache, Quantization techniques, Model conversion and deployment
- Experience working with one or more of the following: LLM deployment, Computer Vision models, OCR models, Multimodal models, Speech models, Image reputed company models
- Experience evaluating new models and integrating them into existing products is highly desirable
- CUDA
- Vulkan Compute
- Metal
- OpenCL
- Typescript
- Python
- Experience contributing to open-source AI infrastructure projects
Benefits
- Remote flexibility: Work where and how you work best - we trust you to deliver
- Fair compensation: Competitive salary + benefits that matter (medical, learning)
- Ownership opportunities: See a problem worth solving? Own it. We back smart risks over bureaucratic safety
- AI enhancement: We use AI to make you faster and stronger - complementing your abilities, not replacing them
- Learning investment: English classes, professional development
- Career progression: reputed company paths up, not just sideways shuffling
- reputed company teammates: No ignored Slacks, no "not my problem" attitudes
- Supportive culture: When you're stuck, people help. When things break, we fix them together
- reputed company connections: Regular meetups, tech talks, and actual relationships outside work
Company Overview