
The goal of this project was to build an app designed to assist parents of babies, toddlers, and young children with sleep-related challenges. The app offers personalized sleeping advice and recommendations through an AI-powered assistant, grounded in academic research and supported by a reasoning engine specializing in pediatric sleep science.
Context
As parents, one of the biggest challenges we’ve faced is the lack of consistent, restful sleep, especially during the early years of our children’s lives. Night wakings, sleep regressions, illness, and other unpredictable changes have often left us exhausted and frustrated.
Currently, we have two children: a baby nearing one year old and a four-year-old. Since the birth of our second child, the struggle around sleep has intensified. Sudden shifts in sleep patterns often happen without any clear external trigger. We’ve tried to make sense of it through online searches, advice from doctors, and our own intuition, but we often ended up with more questions than answers.
So I turned to research. I began reviewing research papers on child sleep development, looking for patterns and science-backed recommendations. That became the foundation for this project.
Tools and Methodologies
The app was developed using Palantir’s Foundry and AIP Platform, leveraging several core services:
- Pipeline Builder for data ingestion, transformation, LLM integration, chunking and embeddings
- Vertex for Knowledge Graphs
- Ontology Manager for semantic and kinetic modeling of data
- Workshop for building the user-facing application

The Approach and Process
Data Collection
The project began with locating and selecting relevant research papers. I searched academic databases, health journals and health institutions for studies focused on baby, toddler and child sleep development. Papers were chosen based on three key criteria:
- Paper domain: baby/toddler/infant sleep
- Number of citations (to ensure credibility)
- Publication by recognized journals or health associations
Data Ingestion
I ingested a set of research papers (in PDF format) covering topics like sleep development, regressions, sleep environments and contributing factors to healthy sleep.
Data Extraction
Using Pipeline Builder, I extracted text from PDFs first at the page level and then split each article into smaller, structured segments or chunks. Each chunk was tagged with metadata including mediaReference, timestamp, article ID, article name, page number and chunk number to ensure downstream traceability and context preservation.
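Pipeline Builder handles the extraction and tagging natively. Outside the platform, the per-chunk record could be sketched in plain Python; the field names mirror the ones above, and `make_chunk_record` is a hypothetical helper, not part of Foundry:

```python
from datetime import datetime, timezone

def make_chunk_record(article_id, article_name, page_number,
                      chunk_number, text, media_reference=None):
    """Build one chunk record carrying the metadata needed for traceability."""
    return {
        "mediaReference": media_reference,  # pointer back to the source PDF
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "articleId": article_id,
        "articleName": article_name,
        "pageNumber": page_number,
        "chunkNumber": chunk_number,
        "text": text,
    }
```

Keeping the article ID and page number on every chunk is what makes it possible, later on, to show the user exactly which paper and page an answer came from.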
Data Processing
This stage included multiple key steps:
Chunking
Text was broken down from article → pages → chunks, allowing for more efficient processing while retaining semantic and contextual integrity (each chunk also keeps its metadata: chunk_id, chunk_number, page_number, etc.). Chunking is critical to the quality of the responses returned by the LLM.
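As a rough illustration of the page → chunks step (the actual transform runs in Pipeline Builder; the word-based window and overlap sizes here are assumptions, not the production settings):

```python
def split_into_chunks(page_text, chunk_size=200, overlap=40):
    """Split one page's text into word-based chunks with a small overlap,
    so sentences cut at a chunk boundary keep some surrounding context."""
    words = page_text.split()
    chunks, start = [], 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
        start += chunk_size - overlap  # step forward, keeping `overlap` words
    return chunks
```

The overlap is a common trade-off: slightly redundant storage in exchange for not losing meaning at chunk boundaries.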
Embeddings
Each chunk was then converted into an embedding: a numerical vector representation of the text that preserves its semantic meaning. These embeddings are used later in the retrieval step of the application, when the user’s query is compared against the embedded chunks of the research papers. The chunks most similar to the user’s input are returned and later compiled into a response.
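Conceptually, that comparison ranks chunks by the cosine similarity between the query embedding and each chunk embedding. A minimal sketch of the measure itself (the platform's embedding model produces the actual vectors):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors: 1.0 means the
    vectors point the same way, 0.0 means they are unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```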
LLM Integration
I designed a detailed prompt to guide the LLM’s behavior: including what to consider, what to ignore, and how to compile a response. The goal was to summarize each chunk while highlighting key sleep-related entities.
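The prompt itself is specific to the app, but its shape looked roughly like this (the wording below is a simplified stand-in for illustration, not the production prompt):

```python
def build_summary_prompt(chunk_text):
    """Assemble a summarization prompt that constrains the LLM to the chunk."""
    return (
        "You are a pediatric sleep research assistant.\n"
        "Summarize the excerpt below in 2-3 sentences.\n"
        "Highlight sleep-related entities (e.g., sleep regression, night wakings).\n"
        "Ignore references, figure captions, and author affiliations.\n"
        "Do not add information that is not in the excerpt.\n\n"
        f"Excerpt:\n{chunk_text}"
    )
```

The explicit "what to ignore" and "do not add" instructions are what keep the summaries grounded in the paper rather than the model's general knowledge.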
Entity Extraction & Deduplication
Entities related to sleep (e.g., “sleep regression,” “sleep duration,” “number of night wakings”, “infant sleep”) were extracted from the summarized text. Duplicates were removed to improve data quality.
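Deduplication amounts to normalizing entity strings before comparing them; a sketch of the idea (case and whitespace normalization is an assumption about what counts as a duplicate):

```python
def dedupe_entities(entities):
    """Collapse near-duplicate entity strings by case/whitespace
    normalization, keeping the first spelling seen."""
    seen, unique = set(), []
    for entity in entities:
        key = " ".join(entity.lower().split())  # normalized comparison key
        if key not in seen:
            seen.add(key)
            unique.append(entity)
    return unique
```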
Create Ontology Objects
I modeled the data into two Ontology Objects: Chunks and Entities, and defined the relationships between them. This modeling enabled graph-based exploration and analysis.
Knowledge Graphs
Using Vertex and its built-in algorithms, I was able to link extracted Entities to their corresponding Chunks. The resulting Knowledge Graph allows users to visually explore how different topics relate to each other across the research dataset.
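Vertex renders the graph visually, but under the hood the structure is just entity-to-chunk links. A hypothetical minimal model of that linkage as an adjacency map:

```python
from collections import defaultdict

def build_entity_index(chunk_entities):
    """Invert {chunk_id: [entities]} into {entity: [chunk_ids]} so that
    selecting an entity reveals every chunk that mentions it."""
    index = defaultdict(list)
    for chunk_id, entities in chunk_entities.items():
        for entity in entities:
            index[entity].append(chunk_id)
    return dict(index)
```

This inversion is what lets a user start from a topic like "sleep regression" and fan out to every paper that discusses it.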

Function Creation
I built a function using AIP Logic to perform a semantic search over the chunks.
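In spirit, the function does something like the following; the `embed` parameter stands in for the platform's embedding model, and the top-k cutoff is an assumption:

```python
import math

def top_k_chunks(query, chunks, embed, k=3):
    """Rank chunks by cosine similarity to the query and return the top k.
    `chunks` is a list of (chunk_id, text); `embed` maps text -> vector."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    q_vec = embed(query)
    scored = [(cosine(q_vec, embed(text)), cid) for cid, text in chunks]
    scored.sort(reverse=True)
    return [cid for _, cid in scored[:k]]
```

In production the chunk embeddings are precomputed in the pipeline rather than re-embedded per query; this sketch recomputes them only to stay self-contained.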
LLM Response
An LLM then compiles the retrieved chunks into an appropriate response to the user’s query.
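The final call stitches the retrieved chunks into a grounded prompt, roughly like this (again a simplified stand-in for the production prompt):

```python
def build_answer_prompt(question, retrieved_chunks):
    """Combine the user's question with retrieved research chunks so the
    LLM answers only from the provided evidence."""
    evidence = "\n\n".join(f"[{cid}] {text}" for cid, text in retrieved_chunks)
    return (
        "Answer the parent's question using only the research excerpts below.\n"
        "Cite the excerpt IDs you relied on. If the excerpts do not cover the\n"
        "question, say so rather than guessing.\n\n"
        f"Excerpts:\n{evidence}\n\nQuestion: {question}"
    )
```

Carrying the chunk IDs through to the answer is what powers the Chunk List feature described below: the app can show exactly which passages the response was built from.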
Application
The final step brought everything together into a single interactive app built with Palantir’s Workshop Service.
Features:
- LLM Prompt Interface: users ask questions about child sleep
- Chunk List: shows the relevant chunks used to generate the answer
- Live Knowledge Graph: dynamically updates to show related chunks/entities
- LLM Response: delivers a summarized, research-backed answer
Challenges and Solutions
One of the biggest hurdles was finding credible, free-to-access research.
To address this, I prioritized papers with:
- Over 20 citations
- Publications from recognized journals or institutions
This ensured a solid, high-quality dataset to build from.
Results and Next Steps
This project helped deepen my understanding of:
- Palantir’s Platform
- Pipeline Builder
- The Ontology
- LLM integration and prompt design
- Semantic data processing and visualization
Most importantly, it showed me the power of a well-structured ontology: when the relationships, definitions, and properties are clearly defined, LLM outputs become more accurate and meaningful.
Limitations (So Far)
The test responses from the app have been inconsistent, not yet meeting the quality I hoped for.
What might need refinement:
- The selection and diversity of research sources
- Prompt design and structure
- The ontology definitions themselves
Comparing answers with their referenced chunks or reviewing them against paper abstracts might reveal where improvements are needed.
Conclusion
This side project was born out of a deeply personal need: to make sense of the chaotic and often overwhelming experience of child sleep. By combining the power of research, semantic modeling, and AI, I’ve built an assistant that doesn’t just regurgitate articles, but synthesizes knowledge into practical, personalized advice.
There is still much to improve, but the foundation is in place and I’m excited to keep building and developing this further.
What’s Next
- Data Enrichment: Ingest more research papers to expand the knowledge base.
- Prompt Refinement: tweak the prompt and adjust entity definitions to improve responses
- Augment Ontology Objects: improving the descriptions of the Ontology Objects and their properties could help the LLM understand queries better and produce stronger responses
- Use-case testing: Run complex, real-world queries (e.g., “My child is 18 months old, has weaned off milk, and wakes up twice a night. What should I try?”).
- Compare LLMs: Test with different LLMs beyond GPT-4.1 mini to assess variation in output quality.
- Expert Feedback: Invite pediatric sleep researchers to test the app and provide insights on accuracy and usability.
Source of inspiration: https://learn.palantir.com/speedrun-your-e2e-aip-workflow
For more information about how I can help you, visit: https://www.oferkulka.com

