Maximiliano Villanueva

Senior AI Engineer | LLM Systems, RAG & Production

€315/day
Barcelona, ES
8-15 years of experience

Average response time: 1h

About Maximiliano

AI Engineer with 10+ years of experience in software engineering and 4+ years focused on building production AI systems.

I specialize in designing and deploying efficient and secure AI solutions, including LLM-based applications, RAG systems, autonomous agents, fine-tuning workflows, and MLOps pipelines.

I have worked as both a consultant and technical lead in early-stage startups, helping teams design AI-driven products from scratch, integrate AI into existing workflows, and scale systems used in production.

Key achievements:

- Reduced AI operational costs by up to 70% by optimizing inference pipelines, model usage, and infrastructure design.
- Designed and deployed multiple AI agents (internal and customer-facing) handling thousands of daily requests with ~95% task success rate.
- Led technical AI strategy and architecture decisions to align product development with business objectives.
  • Spanish: native or bilingual
  • Catalan: native or bilingual
  • English: full professional proficiency

Remote only
Carries out projects primarily remotely

Experience

  • Deepdots
    AI Tech Lead
    OUTSOURCING AGENCIES
    January 2025 - Present (1 year and 5 months)
    Copenhagen, Denmark
    • Led the full lifecycle of production LLM systems (Danish↔English translation, summarization and information extraction), from business definition to deployment: self-hosted open-source models on GCP Cloud Run with a single NVIDIA L4 GPU (24 GB VRAM), serving ~15K requests/day at ~1s latency per request.
    • Cut inference costs by 70% by migrating from third-party APIs to self-hosted open-source models, selecting and sizing candidates (Qwen3 14B, Mistral Small) based on multilingual quality and VRAM footprint.
    • Built an observability and cost-control layer with LiteLLM as a unified gateway: per-request logging, token/latency/spend tracking, fallbacks and weekly cost reporting per product line.
    • Designed RAG systems and agentic workflows that turned customer pain points into production features, improving retention (from non-returning users to recurring usage every 1–2 days).
    • Defined the AI strategy and owned the technical leadership of development, aligning model and infrastructure decisions with commercial objectives.
    Tech stack: Python, Gemini, OpenAI, Google ADK, LangChain, LiteLLM, vector databases, FastAPI, GCP Cloud Run, Docker, open-source models (Qwen, Mistral).
    Python, Google Cloud, Artificial Intelligence, MLOps
  • Saber
    Machine Learning Engineer
    OUTSOURCING AGENCIES
    January 2024 - January 2025 (1 year)
    Amsterdam, Netherlands
    • Sole ML decision-maker in a startup environment: designed the platform's RAG architecture (naive, dense and hybrid retrieval strategies) and the agentic workflows with LLM orchestration.
    • Optimized RAG pipeline consumption, reducing cost and tokens per query through retrieval and context-management improvements.
    • Built a daily-signals feature generated from data collected each day, orchestrating collection, processing and automated delivery.
    • Reported directly to the CEO, acting as technical advisor on the AI product roadmap and feasibility assessments.
    Tech stack: TypeScript, Node.js, Python, OpenAI, GCP, MongoDB.
    TypeScript, Artificial Intelligence, Google Cloud, MongoDB, Back-End Development
  • Sciling
    Machine Learning Engineer & Python Developer
    HIGH TECH
    January 2022 - January 2024 (2 years)
    Valencia, Spain
    • Developed NLP pipelines based on embeddings and classifiers (some of which I trained and deployed) for information extraction in a regulated medical domain.
    • Built RAG systems achieving recall@K >90% in a specific domain, combining dense and hybrid retrieval.
    • Implemented knowledge graphs for expert-system information extraction.
    • Developed a conversational chatbot in the medical domain (IBM Watson Assistant + Speech-to-Text).
    • Fine-tuned open-source models for specific use cases and built backend services with Python, FastAPI and Docker.
    Docker, RAG, Machine Learning, MLOps, Back-End Development

Education

  • MSc
    Universidad Europea
    2025
  • MLOps Specialization
    DeepLearning.AI
    2023

Professional skill set

Categories