(NYC, Full-Time)
About the Role
We're seeking a leading machine learning engineer who can architect breakthrough systems while staying immersed in cutting-edge research. In direct collaboration with the CEO, you'll shape the future of AI at Plastic, tackling challenges across the entire machine learning stack.
This role demands someone who thrives at the intersection of research and engineering: someone who can read and reproduce state-of-the-art papers, design novel architectures, and transform promising experiments into production-ready systems. You'll move fluidly between theoretical frameworks and practical implementation, making intuitive decisions about model architecture, quantization strategies, and serving infrastructure.
We need a technical polymath who excels across the ML stack: from designing systematic experiments and running rigorous evaluations to building robust data pipelines and scalable model serving/inference systems. You should be particularly adept with post-training techniques that are redefining the field, from advanced inference-time computation methods to reinforcement learning with reasoning models.
The LLM space moves at lightning speed, and so do we. You'll prototype rapidly while maintaining research rigor, implement robust MLOps practices, and craft observable systems that scale. Our small, interdisciplinary team moves fast, and high agency is core to who we are. You'll have the freedom to directly impact our research and products to push the boundaries of what's possible in AI.
We're building systems that haven't been built before, solving problems that haven't been solved. If you're a technical leader who thrives on these challenges and can serve as our ML north star, we want you on our team.
About You
- 3+ years of applied ML experience with deep LLM expertise
- High cultural alignment with Plastic Labs' ethos
- NYC-based or open to NYC relocation
- Strong command of a popular Python ML library (e.g., PyTorch, TensorFlow, JAX, Hugging Face Transformers)
- Experience replicating research papers soon after publication
- Experience building and scaling robust inference systems
- Practical experience with post-training and inference-time techniques (RL a plus)
- Ability to build reliable MLOps pipelines that perform under load
- Proficiency with Unix environments and developer tools (Git, Docker, etc.)
- Up to date with the open-source AI community and emerging technologies
- Self-directed with a bias toward rapid execution
- Driven to push language models beyond conventional boundaries
- Background in cognitive sciences (CS, linguistics, neuroscience, philosophy, psychology, etc.) or related fields a plus
Research We're Excited About
s1: Simple test-time scaling
Neural Networks Are Elastic Origami!
Titans: Learning to Memorize at Test Time
Mind Your Theory: Theory of Mind Goes Deeper Than Reasoning
Generative Agent Simulations of 1,000 People
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
Theory of Mind May Have Spontaneously Emerged in Large Language Models
Think Twice: Perspective-Taking Improved Large Language Models' Theory-of-Mind Capabilities
Representation Engineering: A Top-Down Approach to AI Transparency
Theia Vogel's post on Representation Engineering Mistral 7B an Acid Trip
A Roadmap to Pluralistic Alignment
Open-Endedness is Essential for Artificial Superhuman Intelligence
Simulators
Extended Mind Transformers
Violation of Expectation via Metacognitive Prompting Reduces Theory of Mind Prediction Error in Large Language Models
Constitutional AI: Harmlessness from AI Feedback
Claude's Character
Language Models Represent Space and Time
Generative Agents: Interactive Simulacra of Human Behavior
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Cyborgism
Spontaneous Reward Hacking in Iterative Self-Refinement
... accompanying twitter thread