Offline AI with React Native
Combining local LLMs and RAG to help you study PDF content on your smartphone using React Native
In a world where internet connectivity is often unreliable (especially while on a plane), and privacy concerns are growing, I set out to build an AI-powered study assistant that works entirely offline. The goal was to explore what can be done with available tools: create a React Native app that could parse PDFs, extract meaningful content, and provide intelligent chat capabilities - all without requiring an internet connection.
The User Experience
I wanted the app to provide a seamless experience in these ways:
Upload PDF: Users can select PDFs from their device
Automatic Processing: The app extracts text and creates chunks automatically
Study Materials: Generates summaries, quizzes, flashcards, and mind maps
AI Chat: Two chat modes - general AI assistant and document-specific tutor
Offline Operation: Everything works without internet connectivity
Architecture Overview
The LearnPDF app follows a simple architecture that combines several currently available technologies:
React Native/Expo for cross-platform mobile development (and why not use JS for everything?)
React Native ExecuTorch for on-device AI model execution (running the Llama 3.2 1B model)
MMKV for high-performance local storage
Zustand with Immer for state management
Web-based PDF parsing for text extraction
RAG (Retrieval-Augmented Generation) for context-aware responses (using the All-MiniLM-L6-v2 model for embeddings)
The Journey: From Parsing to Intelligence
1. PDF Processing Pipeline
The foundation of the app lies in its ability to extract and process PDF content locally, so I implemented a custom PDF processing system. This turned out to be the most challenging part: PDF.js (and any tool built on it) can't run in a React Native environment because there is no support for web workers, and I initially didn't want to use WebViews:
Custom PDF Parser (PDFExtractor.ts)
export class PDFTextExtractor {
private buffer: Uint8Array;
private objects: Map<number, PDFObject> = new Map();
private pages: PDFPage[] = [];
async extractText(): Promise<string> {
// first, parse the cross-reference table to find each object's byte offset
this.parseObjects();
this.findPages();
return await this.extractTextFromPages();
}
}
So I set out to learn how to parse PDFs and extract text from them manually.
This first custom parser handles:
PDF object parsing and decompression using fflate
Stream extraction and text command parsing
Page-by-page text extraction
React Native-compatible string conversion
This method worked for simple PDFs with a limited set of encodings, but performed poorly on most of my PDFs. I had to accept that the goal was not to rewrite PDF.js to run on React Native without WebViews, but simply to extract text, so I painfully forced myself to integrate a WebView into the project.
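To illustrate why the hand-written approach only got so far: once a content stream is decompressed, the text lives behind text-showing operators like `Tj`. A naive extractor (illustrative, not the app's actual parser) just pulls out the literal strings:

```typescript
// Pull literal strings out of Tj text-showing operators in an uncompressed
// PDF content stream. This only works when text is stored as plain literal
// strings in a simple encoding, which is exactly why a hand-written parser
// breaks on real-world PDFs (CID fonts, custom encodings, hex strings, ...).
const extractTjStrings = (contentStream: string): string[] => {
  const strings: string[] = [];
  // match "(...)" literal strings immediately followed by a Tj operator
  const tjRegex = /\(((?:\\.|[^\\()])*)\)\s*Tj/g;
  let match: RegExpExecArray | null;
  while ((match = tjRegex.exec(contentStream)) !== null) {
    // unescape \( \) \\ inside the literal string
    strings.push(match[1].replace(/\\([()\\])/g, '$1'));
  }
  return strings;
};
```

Handling `TJ` arrays, font encodings, and ToUnicode maps on top of this quickly approaches a reimplementation of PDF.js, which is where the WebView approach wins.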
WebView-Based Processing (PDFWebView.tsx)
So I set up the web view (using react-native-webview) in which I could run PDF.js and it worked amazingly well even for more complex PDFs:
const PDFWebView: React.FC<PDFWebViewProps> = ({
fileUri,
fileName,
onTextExtracted,
}) => {
const webViewRef = useRef<WebView>(null);
const loadPDFInWebView = useCallback(async () => {
// read the PDF from disk and hand it to the WebView as a data URL
const base64Data = await FileSystem.readAsStringAsync(fileUri, {
encoding: FileSystem.EncodingType.Base64,
});
const message = JSON.stringify({
type: 'load-pdf',
pdfData: `data:application/pdf;base64,${base64Data}`,
});
webViewRef.current?.postMessage(message);
}, [fileUri]);
};
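On the other side of the bridge, the page running inside the WebView has to undo this encoding before handing the bytes to PDF.js. A sketch of that receiving step, with hypothetical names (the real page also loads PDF.js and calls `pdfjsLib.getDocument({ data })` with the result):

```typescript
// Parse the message posted from React Native, strip the data-URL prefix,
// and decode the base64 payload into raw bytes for PDF.js.
const decodePdfMessage = (raw: string): Uint8Array => {
  const { type, pdfData } = JSON.parse(raw) as {
    type: string;
    pdfData: string;
  };
  if (type !== 'load-pdf') throw new Error(`Unexpected message type: ${type}`);
  const base64 = pdfData.replace(/^data:application\/pdf;base64,/, '');
  // in the WebView page you would use atob(); Buffer is used here so the
  // sketch also runs under Node
  return Uint8Array.from(Buffer.from(base64, 'base64'));
};
```

PDF.js then walks each page's text content and posts the extracted text back through `window.ReactNativeWebView.postMessage`.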
2. Text Chunking and RAG Implementation
Once text is extracted, it needs to be intelligently chunked for RAG:
const chunkText = (text: string, documentId: string): TextChunk[] => {
// a simple mechanism to split text (can be improved, eg. split by paragraphs,...)
const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
const chunks: TextChunk[] = [];
const chunkSize = 400;
let currentChunk = '';
let chunkIndex = 0;
for (const sentence of sentences) {
if (currentChunk.length + sentence.length + 1 > chunkSize && currentChunk.length > 0) {
chunks.push({
id: `${documentId}_chunk_${chunkIndex}`,
documentId,
content: currentChunk.trim(),
metadata: {
pageNumber: 1,
section: 'main',
wordCount: currentChunk.split(/\s+/).length,
},
});
currentChunk = sentence + '. ';
chunkIndex++;
} else {
currentChunk += sentence + '. ';
}
}
// don't drop the trailing partial chunk
if (currentChunk.trim().length > 0) {
chunks.push({
id: `${documentId}_chunk_${chunkIndex}`,
documentId,
content: currentChunk.trim(),
metadata: {
pageNumber: 1,
section: 'main',
wordCount: currentChunk.split(/\s+/).length,
},
});
}
return chunks;
};
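As the comment in the code notes, this splitting can be improved. One possible refinement (not in the app) is to carry the last sentence of each chunk into the next one, so retrieval doesn't lose context at chunk boundaries:

```typescript
// Sentence-based chunking with overlap: the trailing sentence(s) of each
// chunk are repeated at the start of the next, so a fact that straddles a
// boundary is still fully contained in at least one chunk.
const chunkWithOverlap = (
  text: string,
  chunkSize = 400,
  overlapSentences = 1
): string[] => {
  const sentences = text
    .split(/[.!?]+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  const chunks: string[] = [];
  let current: string[] = [];
  let length = 0;
  for (const sentence of sentences) {
    if (length + sentence.length > chunkSize && current.length > 0) {
      chunks.push(current.join('. ') + '.');
      // keep the trailing sentences as overlap for the next chunk
      current = current.slice(-overlapSentences);
      length = current.reduce((n, s) => n + s.length, 0);
    }
    current.push(sentence);
    length += sentence.length;
  }
  if (current.length > 0) chunks.push(current.join('. ') + '.');
  return chunks;
};
```

Paragraph-aware splitting would be the next step up, since paragraphs usually align with topic boundaries.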
3. On-Device AI with ExecuTorch
The heart of the offline chat capability is ExecuTorch, which allows running AI models directly on the device:
export const useLLM = ({ preventLoad = false }): CustomLLMType => {
// state setters (setToken, setResponse, setMessageHistory, ...) come from
// useState calls omitted here for brevity
const controllerInstance = useMemo(
() => new LLMModule({
tokenCallback: (newToken: string) => {
setToken(newToken);
setResponse((prevResponse) => prevResponse + newToken);
},
messageHistoryCallback: setMessageHistory,
}),
[]
);
useEffect(() => {
if (preventLoad) return;
(async () => {
try {
await controllerInstance.load(LLAMA3_2_1B, setDownloadProgress);
setIsReady(true);
} catch (e) {
setError(e);
}
})();
}, [controllerInstance, preventLoad]);
};
4. State Management with Zustand
The app uses Zustand with Immer for predictable state management, with state persisted to device storage via MMKV:
export const useAppStore = create<AppState>()(
zustandPersist(
immer((set, get) => ({
documents: {
items: [] as PDFDocument[],
selectedId: null,
searchQuery: '',
sortBy: 'date',
sortOrder: 'desc',
},
// ... other state slices
actions: createActions(set, get),
})),
{
name: 'learnpdf-store',
storage: createMMKVStorage(),
}
)
);
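Zustand's persist middleware accepts any object exposing `getItem`/`setItem`/`removeItem`, so `createMMKVStorage` is a thin adapter. A sketch of what it might look like, with the backing store injected so the adapter is testable; in the app this would be an MMKV instance from react-native-mmkv (which exposes `getString`/`set`/`delete`):

```typescript
// Minimal backend contract matching the subset of the MMKV API we need.
interface KVBackend {
  getString(key: string): string | undefined;
  set(key: string, value: string): void;
  delete(key: string): void;
}

// Adapter shape expected by zustand's persist storage option.
const createKVStorage = (backend: KVBackend) => ({
  getItem: (name: string): string | null => backend.getString(name) ?? null,
  setItem: (name: string, value: string): void => backend.set(name, value),
  removeItem: (name: string): void => backend.delete(name),
});
```

Because MMKV reads and writes synchronously, there is no async hydration flicker on app start, which is one of its big wins over AsyncStorage.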
5. The Chat Experience
General AI Chat (LLMChatScreen.tsx)
The general chat interface provides a clean, WhatsApp-like experience:
import { useLLM, LLAMA3_2_1B } from 'react-native-executorch';
// inside the component
const llm = useLLM({ model: LLAMA3_2_1B });
const handleSendMessage = async () => {
if (!inputText.trim() || !llm?.isReady || llm?.isGenerating) return;
try {
Keyboard.dismiss();
await llm?.sendMessage(inputText.trim());
setInputText('');
} catch (error) {
Alert.alert('Error', 'Failed to send message. Please try again.');
}
};
Context-Aware Tutor Chat (TutorChatScreen.tsx)
The tutor chat provides document-specific assistance:
const handleSendMessage = async () => {
const systemPrompt = `You are a helpful AI tutor assistant. Your role is to help students understand and learn from their PDF document content.`;
const contextMessages = chatState.messages.slice(-5);
const messages = [
{ role: 'system' as const, content: systemPrompt },
{ role: 'system' as const, content: `Document content for reference:\n\n${document.extractedText?.slice(0, 2000) || 'No document content available'}` },
...contextMessages.map((msg) => ({
role: msg.role === 'user' ? ('user' as const) : ('assistant' as const),
content: msg.content,
})),
{ role: 'user' as const, content: userMessage.content },
];
const response = await llm.generate(messages);
};
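Slicing the first 2,000 characters is a blunt instrument: the answer to the student's question may live on page 40. A lightweight improvement, short of full embedding-based retrieval, is to score chunks against the question and send only the most relevant ones. This keyword-overlap scorer is a hypothetical stand-in for proper embedding similarity:

```typescript
// Score a chunk by how many of the question's content words it contains.
const scoreChunk = (question: string, chunk: string): number => {
  const terms = new Set(
    question.toLowerCase().split(/\W+/).filter((t) => t.length > 3)
  );
  const words = chunk.toLowerCase().split(/\W+/);
  return words.filter((w) => terms.has(w)).length;
};

// Keep only the top-k highest-scoring chunks to use as prompt context.
const selectRelevantChunks = (
  question: string,
  chunks: string[],
  topK = 3
): string[] =>
  [...chunks]
    .sort((a, b) => scoreChunk(question, b) - scoreChunk(question, a))
    .slice(0, topK);
```

Swapping `scoreChunk` for cosine similarity over All-MiniLM-L6-v2 embeddings upgrades this into the real RAG pipeline without changing the prompt assembly.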
6. AI-Powered Content Generation
Beyond chat capabilities, the app generates various study materials using AI. The content generation system follows a consistent pattern across the different tools while maintaining type safety and simple error handling.
Summary Generation (SummaryScreen.tsx)
The summary generation provides concise document overviews:
const handleGenerateSummary = async (): Promise<void> => {
if (!document || !llm?.isReady) return;
setIsGeneratingSummary(true);
setSummaryError(null);
try {
const systemPrompt = `You are a helpful assistant that generates concise summaries of documents.
Summarize the key points, main concepts, and important information from the provided text. Keep the summary focused and relevant.`;
const messages: Message[] = [
{
role: 'system' as const,
content: systemPrompt,
},
{
role: 'user' as const,
content: `Please summarize the following document:
\n\n${document.extractedText || ''}`,
},
];
const summary: string = await llm.generate(messages);
const updatedDocument: PDFDocument = { ...document, summary };
setDocument(updatedDocument);
StorageService.getInstance().updateDocument(updatedDocument);
} catch (error) {
console.error('Error generating summary:', error);
setSummaryError('Failed to generate summary. Please try again.');
} finally {
setIsGeneratingSummary(false);
}
};
Quiz Generation (QuizScreen.tsx)
The quiz system generates interactive questions in multiple formats; this is a perfect example of structured output in practice:
interface QuizState {
questions: Question[];
currentQuestionIndex: number;
selectedAnswers: (number | string | null)[];
showResults: boolean;
score: number;
}
const handleGenerateQuiz = async (): Promise<void> => {
if (!document || !llm?.isReady) return;
setIsGeneratingQuiz(true);
setQuizError(null);
try {
const systemPrompt = `You are a helpful assistant that generates educational quiz questions.
Generate diverse, challenging questions based on the provided text using ONLY these question types:
- multiple-choice (with 4 options)
- true-false (with correctAnswer: 0 for true, 1 for false)
- short-answer (with correctAnswer as a string)
IMPORTANT:
- Return ONLY a valid JSON array
- Use ONLY the exact type names: "multiple-choice", "true-false", "short-answer"
- Do not include any explanatory text, markdown formatting, or additional commentary
- Just the raw JSON array`;
const messages: Message[] = [
{
role: 'system' as const,
content: systemPrompt,
},
{
role: 'user' as const,
content: `Generate 5 quiz questions based on this document text:
\n\n${document.extractedText || ''}
Return ONLY a JSON array using ONLY these three question types:
[{
"id": "unique_id",
"type": "multiple-choice",
"question": "Question text?",
"options": ["Option A", "Option B", "Option C", "Option D"],
"correctAnswer": 0,
"explanation": "Explanation of correct answer",
"difficulty": "medium"
}]`,
},
];
const response: string = await llm.generate(messages);
const questions: Question[] = parseQuizResponse(response);
// Save quiz to store
const quiz: Quiz = {
documentId: document.id,
title: `Quiz for ${document.name}`,
questions,
};
saveQuiz(quiz);
setQuizState({
questions,
currentQuestionIndex: 0,
selectedAnswers: new Array(questions.length).fill(null),
showResults: false,
score: 0,
});
} catch (error) {
console.error('Error generating quiz:', error);
setQuizError('Failed to generate quiz. Please try again.');
} finally {
setIsGeneratingQuiz(false);
}
};
Quiz Response Parsing:
const parseQuizResponse = (response: string): Question[] => {
let jsonStr: string = response.trim();
// Handle markdown code blocks
if (jsonStr.includes('```json')) {
const jsonMatch = jsonStr.match(/```json\s*([\s\S]*?)\s*```/);
if (jsonMatch) {
jsonStr = jsonMatch[1].trim();
}
} else if (jsonStr.includes('```')) {
const codeMatch = jsonStr.match(/```\s*([\s\S]*?)\s*```/);
if (codeMatch) {
jsonStr = codeMatch[1].trim();
}
}
// Extract JSON array if not at start
if (!jsonStr.startsWith('[')) {
const arrayMatch = jsonStr.match(/\[[\s\S]*\]/);
if (arrayMatch) {
jsonStr = arrayMatch[0];
}
}
const questionsData: RawQuestion[] = JSON.parse(jsonStr);
return questionsData.map((q, index): Question => ({
id: q.id || `q_${Date.now()}_${index}`,
type: q.type === 'true/false' ? 'true-false' : q.type || 'multiple-choice',
question: q.question || `Question ${index + 1} about the document content?`,
options: q.options || ['Option A', 'Option B', 'Option C', 'Option D'],
correctAnswer: q.correctAnswer || 0,
explanation: q.explanation || 'This question was generated based on the document content.',
difficulty: q.difficulty || 'medium',
}));
};
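One defensive step the mapping above doesn't perform: small local models occasionally emit questions whose fields don't match their declared type (e.g. a multiple-choice question with two options). A hypothetical validation guard, using the same `Question` shape the prompt requests, could filter these out before rendering:

```typescript
// Shape matching the JSON the prompt asks the model to return.
interface Question {
  id: string;
  type: 'multiple-choice' | 'true-false' | 'short-answer';
  question: string;
  options?: string[];
  correctAnswer: number | string;
  explanation: string;
  difficulty: string;
}

// Reject questions whose fields don't match their declared type.
const isValidQuestion = (q: Question): boolean => {
  switch (q.type) {
    case 'multiple-choice':
      return (
        Array.isArray(q.options) &&
        q.options.length === 4 &&
        typeof q.correctAnswer === 'number' &&
        q.correctAnswer >= 0 &&
        q.correctAnswer < 4
      );
    case 'true-false':
      return q.correctAnswer === 0 || q.correctAnswer === 1;
    case 'short-answer':
      return typeof q.correctAnswer === 'string';
    default:
      return false;
  }
};
```

Running `questions.filter(isValidQuestion)` after parsing would keep one malformed item from breaking the whole quiz.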
Mind Map Generation (MindMapScreen.tsx)
The mind map system creates hierarchical visual representations:
interface MindMapState {
nodes: MindMapNode[];
expandedNodes: Set<string>;
}
const handleGenerateMindMap = async (): Promise<void> => {
if (!document || !llm?.isReady) return;
setIsGeneratingMindMap(true);
setMindMapError(null);
try {
const systemPrompt = `You are a helpful assistant that creates hierarchical mind maps.
Extract key concepts and organize them into a logical hierarchy to help visualize the structure of information.
Always create at least one root node with children to make the mind map interactive.`;
const messages: Message[] = [
{
role: 'system' as const,
content: systemPrompt,
},
{
role: 'user' as const,
content: `Create an interactive mind map structure from this document text:
\n\n${document.extractedText || ''}
REQUIREMENTS:
1. Create a hierarchical structure with at least 3 levels
2. Include a main root node (level 0) with meaningful children
3. Each parent node should have 2-4 child nodes
4. Use meaningful, concise labels (max 3-4 words)
Return ONLY a JSON array with this EXACT structure:
[
{
"id": "root",
"label": "Main Topic",
"children": ["topic1", "topic2"],
"parent": null,
"level": 0
}
]`,
},
];
const response: string = await llm.generate(messages);
const processedNodes: MindMapNode[] = parseMindMapResponse(response);
// Find root nodes and set up initial expansion
const rootNodeIds: string[] = processedNodes
.filter((node: MindMapNode) => !node.parent || node.parent === null)
.map((node: MindMapNode) => node.id);
setMindMapState({
nodes: processedNodes,
expandedNodes: new Set(rootNodeIds), // Expand all root nodes by default
});
} catch (error) {
console.error('Error generating mind map:', error);
setMindMapError('Failed to generate mind map. Please try again.');
} finally {
setIsGeneratingMindMap(false);
}
};
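Once the nodes are parsed, the `expandedNodes` set drives what gets rendered. A sketch of that step (hypothetical helper, matching the node shape the prompt requests): a node is visible only if every ancestor up to the root is expanded.

```typescript
// Node shape matching the JSON structure requested from the model.
interface MindMapNode {
  id: string;
  label: string;
  children: string[];
  parent: string | null;
  level: number;
}

// A node is visible when all of its ancestors are in the expanded set.
const getVisibleNodes = (
  nodes: MindMapNode[],
  expanded: Set<string>
): MindMapNode[] => {
  const byId = new Map(nodes.map((n) => [n.id, n]));
  const isVisible = (node: MindMapNode): boolean => {
    let parent = node.parent ? byId.get(node.parent) : null;
    while (parent) {
      if (!expanded.has(parent.id)) return false;
      parent = parent.parent ? byId.get(parent.parent) ?? null : null;
    }
    return true;
  };
  return nodes.filter(isVisible);
};
```

Toggling a node then just means adding or removing its id from the set and re-filtering, which keeps the render logic purely derived from state.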
Performance Optimizations
Lazy Loading: AI models are only loaded when needed
Chunked Processing: Large PDFs are processed in chunks to prevent UI blocking
Efficient Storage: MMKV provides faster read/write operations than AsyncStorage
Memoization: Used React.memo and useMemo for expensive computations
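The "Chunked Processing" point above can be sketched as a loop that yields back to the event loop between batches so the JS thread can keep rendering; this is an assumed pattern, not the app's exact code:

```typescript
// Process items in batches, yielding to the event loop between batches so
// pending UI work (renders, gestures) can run in between.
const processInBatches = async <T, R>(
  items: T[],
  batchSize: number,
  process: (item: T) => R
): Promise<R[]> => {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    for (const item of items.slice(i, i + batchSize)) {
      results.push(process(item));
    }
    // give the event loop a turn before starting the next batch
    await new Promise((resolve) => setTimeout(resolve, 0));
  }
  return results;
};
```

In a React Native app, `InteractionManager.runAfterInteractions` is another common way to defer heavy work until animations finish.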
Lessons Learned
What Worked Well
React Native ExecuTorch Integration: Provided excellent offline AI capabilities
MMKV Storage: Significantly improved performance over AsyncStorage
Webview PDF Parsing: Ensured compatibility with various PDF formats
Zustand State Management: Simplified state management with great performance
Future Enhancements
Vector Embeddings: Implement proper vector embeddings for better RAG
Model Optimization: Quantize newer models for better performance
Multi-Document Support: Allow chat across multiple documents
Fix Cross-Platform Issues: Some native features behave differently on iOS vs Android
Model Size & Memory Constraints: AI models are large and require careful download management, and device capabilities need to be checked in advance to select an appropriate model. Right now the model can crash the app if the device doesn't have enough available memory.
Conclusion
Building an offline AI chat tool required solving a few challenges in PDF processing, AI model management, and mobile performance optimization. The result is a study assistant that runs entirely offline while providing intelligent, context-aware responses.
React Native ExecuTorch doesn’t currently leverage the native GPU acceleration available on the latest iPhones and Android devices, but it provides a good testing ground and proof of concept. The next step would be to integrate Apple’s CoreML backend and see how it compares.
The LearnPDF app is available on both the iOS App Store and Google Play Store (testers needed). Feel free to try it out, but keep in mind that it hasn’t been tested on many devices.