CV - Lila Boualili

Experience

Data Scientist	Oct. 2023 – present
Mindflow	Paris, France
- Designed and deployed a production-grade multi-agent orchestration engine for workflow automation, enabling natural-language interaction with 4000+ API integrations and custom agents. [Patent Pending]
- Mitigated operational risk with a security-first execution layer featuring credential-scoped tool execution, human-in-the-loop (HITL) approval gates for high-stakes API actions, and full execution traceability.
- Engineered a hierarchical orchestration engine with long-term memory and RAG-based tool discovery, enabling the system to handle complex, multi-turn tasks while minimizing context bloat.
- Developed data augmentation pipelines for API documentation and tool metadata, generating provider-specific agent skills and improving tool retrieval quality.
- Designed an evaluation framework to benchmark LLM planning and tool-use capabilities across standard and edge-case scenarios, enabling regression testing and model drift monitoring over time.
STACK: AWS Cloud, Amazon Bedrock, OpenAI, Gemini, Mistral, LiteLLM, RAG, multi-agent orchestration.

Postdoc Research Fellowship	Dec. 2022 – Sept. 2023
LIG, University of Grenoble Alps	Grenoble, France
- Investigated systematic compositional generalization in transformer-based seq2seq models by incorporating syntactic structure into the decoding process through hyperbolic representations of dependency trees.
- Designed, trained, and evaluated a hybrid Euclidean–Hyperbolic transformer architecture for structure-aware sequence generation.
STACK: Transformers, Fairseq, PyTorch, Geoopt.

Research internships

A Study of Term-Topic Embeddings	June 2021 – Feb. 2022
Max Planck Institute for Informatics (MPI)	Saarbrücken, Germany
Advisor: Andrew Yates
Studied advances in the ColBERT architecture, which relies on token-level representations with late interaction for document ranking. Proposed a structured distillation approach for ColBERT's contextualized token embeddings, using aggregated, frozen, pre-trained term-topic embeddings that represent token-level contextual semantics.

Research on Microblog Retrieval and Summarization	Dec. 2018 – July 2019
Institut de Recherche en Informatique de Toulouse (IRIT)	Toulouse, France
Advisor: Mohand Boughanem
Developed a tweet summarization pipeline leveraging deep learning models for semantic tweet representation and relevance ranking with respect to user interests.

Education

Ph.D. in Computer Science	2019-2022
IRIT Laboratory, University of Toulouse III - Paul Sabatier	Toulouse, France
- Thesis topic: Studying Relevant Signals for Document Retrieval using Transformer Models
- Highlights: Enhanced, fine-tuned, and evaluated encoder models for ad hoc retrieval, covering cross-encoder and dual-encoder architectures, single-vector and multi-vector representations, and training by direct supervision or by distillation from teacher models.
- Area of study: Deep Learning, Information Retrieval, and Natural Language Processing

Master's degree in Computer Science and Engineering	2018-2019
Higher National School of Computer Science (ESI)	Algiers, Algeria
- Area of study: Deep Learning, Information Retrieval, and Natural Language Processing

Engineering degree in Computer Science and Engineering (Valedictorian)	2014-2019
Higher National School of Computer Science (ESI)	Algiers, Algeria
Majored in Information Systems & Software

Projects

DeepResearchPy
A Python package for automated deep query investigation, adapted from JinaAI's node-deepsearch. It iteratively searches, reads, and reasons across the web until it finds a satisfactory answer or reaches a token budget limit.

SciWatch
A Python package that delivers scheduled newsletters with relevant scientific papers, using boolean retrieval for query matching.

HybridSeq2Seq
Transformer-based Euclidean-hyperbolic hybrid seq2seq model for COGS semantic parsing, leveraging hyperbolic embeddings to capture hierarchical structures and enhance compositional generalization.

Skills

Programming: Python, TypeScript
Libraries: PyTorch, Transformers, LangChain, LangGraph
Operating Systems: Linux and other UNIX variants, Microsoft Windows
Agile Methodologies: Scrum and Kanban
Languages: French (Native), English (Full Professional Proficiency)

Publications

MarkedBERT: Integrating Traditional IR Cues in Pre-trained Language Models for Passage Retrieval

Lila Boualili, Jose G. Moreno, and Mohand Boughanem. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '20), Xi'an, China, July 25–30, 2020

Highlighting exact matching via marking strategies for ad hoc document ranking with pretrained contextualized language models

Lila Boualili, Jose G. Moreno, and Mohand Boughanem. Information Retrieval Journal 25 (4), 414-460

Deep learning for information retrieval: studying relevant signals for ad hoc search based on transformer models

Thesis, Université Paul Sabatier-Toulouse III

A Study of Term-Topic Embeddings for Ranking

Lila Boualili and Andrew Yates. Advances in Information Retrieval: 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2–6, 2023, Proceedings, Part II

Teaching

Data Structures and Fundamental Algorithms	Nov. 2019 – 2021
University of Toulouse III - Paul Sabatier	Toulouse, France
Undergraduate course (Semester I), taught in French.

Database Programming and Administration	Mar. 2020 – 2021
University of Toulouse III - Paul Sabatier	Toulouse, France
Undergraduate course (Semester II), taught in French.

Hobbies

Reading and gaming (with a strong interest in horror)
Drawing and photography
Bouldering