This position has been filled. However, we're always looking to meet great candidates. If you like what's listed here, please reach out regardless -- we are growing fast and might have similar positions in the future.
Check out our other open positions here.
(NYC / Remote, Full-Time)
About the Role
We’re searching for a machine learning research engineer excited to work on problems critical to the next version of Honcho. In large part, this means exploring novel ways to create and leverage high-fidelity user representations.
This role requires subversive tactics for working with LLMs: approaching them at the representation level, top-down thinking, emergent abilities, meta-methods, etc. There’s a massive, under-explored gulf between deterministic, structured LLM use cases and xenocognitive psychoanalysis. Within it lies enormous opportunity to investigate human-LLM coherence and the personalized, positive-sum experiences we can build on that substrate.
We think the lack of study here is heavily to blame for the divorce of ML evals from product metrics. That divorce leads to poor application retention, trust violations, high churn, and the avoidance (abandonment?) of much of the space of possible AI apps, especially those characterized by personalization, user coherence, individual alignment, and social acuity.
At Plastic, you’ll have the opportunity to work with an interdisciplinary team to discover the best ways to represent human identity, unite ML & product outcomes, train best-in-class theory of mind models, author open-ended evals & frameworks, publish research, contribute to open-source ML, set new standards of transparency for user data, and shine a light on overhung capabilities lying latent in large language models.
About You
- 2-3 years of NLP/ML experience or equivalent
- High cultural alignment with Plastic Labs’ ethos
- Primary location within +/- 3 hours of EST
- Up to date on the open-source AI community & technologies
- Comfortable in a Unix environment and with attendant command-line tools (Git, Docker, etc)
- Experience training and fine-tuning LLMs & evaluating their performance
- Experience implementing & maintaining systems based on SOTA research
- Experience or background in alignment & interpretability methods
- Proficiency with some set of popular Python ML libraries (e.g. PyTorch, TF, JAX, HF transformers, etc)
- Complementary interest or experience specific to representation engineering, control vectors, prompt optimization, sparse auto-encoders, agentic frameworks, emergent behaviors, theory of mind, or identity a plus
- Complementary background in the cognitive sciences (computer science, linguistics, neuroscience, philosophy, & psychology) or other adjacent interdisciplinary fields a plus
How to Apply
Please send the following to research@plasticlabs.ai:
- Resume/CV in whatever form it exists (PDF, LinkedIn, website, etc)
- Portfolio of notable work (GitHub, pubs, ArXiv, blog, X, etc)
- Statement of alignment specific to Plastic Labs—how do you identify with our mission, how can you contribute, etc? (points for brief, substantive, heterodox)
Applications without these 3 items won’t be considered, but optimize for speed over perfection. If relevant, credit the LLM you used.
And it can’t hurt to join our Discord and introduce yourself, or engage with our GitHub.
Research We’re Excited About
Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
Theory of Mind May Have Spontaneously Emerged in Large Language Models
Think Twice: Perspective-Taking Improves Large Language Models’ Theory-of-Mind Capabilities
Representation Engineering: A Top-Down Approach to AI Transparency
Theia Vogel’s post, Representation Engineering Mistral 7B an Acid Trip
A Roadmap to Pluralistic Alignment
Open-Endedness is Essential for Artificial Superhuman Intelligence
Simulators
Extended Mind Transformers
Violation of Expectation via Metacognitive Prompting Reduces Theory of Mind Prediction Error in Large Language Models
Constitutional AI: Harmlessness from AI Feedback
Claude’s Character
Language Models Represent Space and Time
Generative Agents: Interactive Simulacra of Human Behavior
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Cyborgism
Spontaneous Reward Hacking in Iterative Self-Refinement
… and its accompanying Twitter thread
(Back to Work at Plastic)