
Offline AI with React Native

Combining local LLMs and RAG to help you study PDF content on your smartphone using React Native


In a world where internet connectivity is often unreliable (especially while on a plane), and privacy concerns are growing, I set out to build an AI-powered study assistant that works entirely offline. The goal was to explore what can be done with available tools: create a React Native app that could parse PDFs, extract meaningful content, and provide intelligent chat capabilities - all without requiring an internet connection.

The User Experience

I wanted the app to provide a seamless experience in these ways:

  1. Upload PDF: Users can select PDFs from their device

  2. Automatic Processing: The app extracts text and creates chunks automatically

  3. Study Materials: Generates summaries, quizzes, flashcards, and mind maps

  4. AI Chat: Two chat modes - general AI assistant and document-specific tutor

  5. Offline Operation: Everything works without internet connectivity

Architecture Overview

The LearnPDF app follows a simple architecture that combines several currently available technologies:

  • React Native/Expo for cross-platform mobile development (And why not use JS for everything?)

  • React Native ExecuTorch for on-device AI model execution (running Llama 3.2 1B model)

  • MMKV for high-performance local storage

  • Zustand with Immer for state management

  • WebView-based PDF parsing for text extraction

  • RAG (Retrieval-Augmented Generation) for context-aware responses (using All-MiniLM-L6-v2 model for embeddings)

The Journey: From Parsing to Intelligence

1. PDF Processing Pipeline

The foundation of the app lies in its ability to extract and process PDF content locally. I implemented a custom PDF processing system, which in the end turned out to be the most challenging part: PDF.js (and any other tool based on it) can't run in a React Native environment because there is no support for web workers, and I initially didn't want to use web views:

Custom PDF Parser (PDFExtractor.ts)

export class PDFTextExtractor {
  private buffer: Uint8Array;
  private objects: Map<number, PDFObject> = new Map();
  private pages: PDFPage[] = [];

  async extractText(): Promise<string> {
    // parse the cross-reference table to find the address of each object
    this.parseObjects();
    this.findPages();
    return await this.extractTextFromPages();
  }
}

So I set out to learn how to parse a PDF and extract its text manually.
This first custom parser handles:

  • PDF object parsing and decompression using fflate

  • Stream extraction and text command parsing

  • Page-by-page text extraction

  • React Native-compatible string conversion
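To give a flavor of the "text command parsing" bullet above, here is a minimal, hypothetical sketch (not the app's actual parser): it pulls the string operands of `Tj` operators out of an already-decompressed PDF content stream. Real streams also use `TJ` arrays, hex strings, and font-specific encodings, which is exactly why this approach struggled with most PDFs.

```typescript
// Hypothetical helper illustrating the text-command parsing step:
// extract the string operands of Tj operators from a decompressed
// PDF content stream. TJ arrays, hex strings (<...>), and font
// encodings are deliberately ignored in this sketch.
export function extractTjStrings(contentStream: string): string[] {
  const out: string[] = [];
  // Match "(...)" followed by the Tj operator; PDF escapes \( \) \\ etc.
  const tj = /\(((?:\\.|[^\\()])*)\)\s*Tj/g;
  let m: RegExpExecArray | null;
  while ((m = tj.exec(contentStream)) !== null) {
    // Unescape the common PDF string escapes.
    out.push(
      m[1].replace(/\\([()\\nrt])/g, (_, c) =>
        c === 'n' ? '\n' : c === 'r' ? '\r' : c === 't' ? '\t' : c
      )
    );
  }
  return out;
}
```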

This method worked for simple PDFs with a limited set of encodings, but performed poorly on most of my PDFs. I had to accept that the goal was not to rewrite PDF.js to run on React Native without web views, but simply to extract text, and so I painfully forced myself to integrate a web view into the project.

WebView-Based Processing (PDFWebView.tsx)

So I set up the web view (using react-native-webview) in which I could run PDF.js and it worked amazingly well even for more complex PDFs:

const PDFWebView: React.FC<PDFWebViewProps> = ({
  fileUri,
  fileName,
  onTextExtracted,
}) => {
  const loadPDFInWebView = useCallback(async () => {
    const base64Data = await FileSystem.readAsStringAsync(fileUri, {
      encoding: FileSystem.EncodingType.Base64,
    });

    const message = JSON.stringify({
      type: 'load-pdf',
      pdfData: `data:application/pdf;base64,${base64Data}`,
    });

    webViewRef.current?.postMessage(message);
  }, [fileUri]);
};
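The page inside the WebView (running PDF.js) then posts the extracted text back through `onMessage`. The exact message shape below is an assumption for illustration, not necessarily the app's protocol, but the native side needs a handler along these lines:

```typescript
// Hypothetical shape of the message the PDF.js page posts back;
// the app's actual protocol may differ.
interface ExtractionMessage {
  type: 'text-extracted' | 'error';
  text?: string;
  error?: string;
}

// Parse the raw string from the WebView's onMessage event and decide
// whether to report extracted text or an error.
export function handleWebViewMessage(
  raw: string,
  onTextExtracted: (text: string) => void,
  onError: (message: string) => void
): void {
  let msg: ExtractionMessage;
  try {
    msg = JSON.parse(raw);
  } catch {
    onError('Malformed message from PDF WebView');
    return;
  }
  if (msg.type === 'text-extracted' && typeof msg.text === 'string') {
    onTextExtracted(msg.text);
  } else {
    onError(msg.error ?? 'PDF extraction failed');
  }
}
```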

2. Text Chunking and RAG Implementation

Once text is extracted, it needs to be intelligently chunked for RAG:

const chunkText = (text: string, documentId: string): TextChunk[] => {
  // a simple mechanism to split text (can be improved, e.g. split by paragraphs, ...)
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const chunks: TextChunk[] = [];
  const chunkSize = 400;

  let currentChunk = '';
  let chunkIndex = 0;

  for (const sentence of sentences) {
    if (currentChunk.length + sentence.length + 1 > chunkSize && currentChunk.length > 0) {
      chunks.push({
        id: `${documentId}_chunk_${chunkIndex}`,
        documentId,
        content: currentChunk.trim(),
        metadata: {
          pageNumber: 1,
          section: 'main',
          wordCount: currentChunk.split(/\s+/).length,
        },
      });
      currentChunk = sentence + '. ';
      chunkIndex++;
    } else {
      currentChunk += sentence + '. ';
    }
  }

  // don't forget the final partial chunk left in the buffer
  if (currentChunk.trim().length > 0) {
    chunks.push({
      id: `${documentId}_chunk_${chunkIndex}`,
      documentId,
      content: currentChunk.trim(),
      metadata: {
        pageNumber: 1,
        section: 'main',
        wordCount: currentChunk.split(/\s+/).length,
      },
    });
  }
  return chunks;
};
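For the retrieval side of RAG, the embeddings from All-MiniLM-L6-v2 only need a similarity ranking over the stored chunks. A minimal sketch, assuming an `embedding: number[]` has already been computed per chunk by the embedding model (the `embed` step itself is omitted here):

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Rank chunks by similarity to the query embedding and keep the top k;
// the returned contents get pasted into the tutor prompt as context.
export function retrieveTopK(
  queryEmbedding: number[],
  chunks: { content: string; embedding: number[] }[],
  k = 3
): string[] {
  return [...chunks]
    .sort(
      (x, y) =>
        cosine(queryEmbedding, y.embedding) -
        cosine(queryEmbedding, x.embedding)
    )
    .slice(0, k)
    .map((c) => c.content);
}
```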

3. On-Device AI with ExecuTorch

The heart of the offline chat capability is ExecuTorch, which allows running AI models directly on the device:

export const useLLM = ({ preventLoad = false }): CustomLLMType => {
  // useState hooks for token, response, messageHistory, etc. omitted for brevity
  const controllerInstance = useMemo(
    () => new LLMModule({
      tokenCallback: (newToken: string) => {
        setToken(newToken);
        setResponse((prevResponse) => prevResponse + newToken);
      },
      messageHistoryCallback: setMessageHistory,
    }),
    [] // create the controller once
  );

  useEffect(() => {
    if (preventLoad) return;

    (async () => {
      try {
        await controllerInstance.load(LLAMA3_2_1B, setDownloadProgress);
        setIsReady(true);
      } catch (e) {
        setError(e);
      }
    })();
  }, [controllerInstance, preventLoad]);
};

4. State Management with Zustand

The app uses Zustand with Immer for predictable state management, and persisted to device storage using MMKV:

export const useAppStore = create<AppState>()(
  zustandPersist(
    immer((set, get) => ({
      documents: {
        items: [] as PDFDocument[],
        selectedId: null,
        searchQuery: '',
        sortBy: 'date',
        sortOrder: 'desc',
      },
      // ... other state slices
      actions: createActions(set, get),
    })),
    {
      name: 'learnpdf-store',
      storage: createMMKVStorage(),
    }
  )
);
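`createMMKVStorage` isn't shown above; a minimal adapter from MMKV's `set`/`getString`/`delete` API to zustand persist's `getItem`/`setItem`/`removeItem` shape might look like the sketch below. It is written against a small injected interface so it can be exercised without a device (the app's real version presumably constructs the `MMKV` instance from react-native-mmkv internally):

```typescript
// Minimal key-value surface matching what react-native-mmkv exposes.
interface KVStore {
  set(key: string, value: string): void;
  getString(key: string): string | undefined;
  delete(key: string): void;
}

// Adapter to zustand persist's StateStorage shape.
export function createMMKVStorage(kv: KVStore) {
  return {
    getItem: (name: string): string | null => kv.getString(name) ?? null,
    setItem: (name: string, value: string): void => kv.set(name, value),
    removeItem: (name: string): void => kv.delete(name),
  };
}
```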

5. The Chat Experience

General AI Chat (LLMChatScreen.tsx)

The general chat interface provides a clean, WhatsApp-like experience:

import { useLLM, LLAMA3_2_1B } from 'react-native-executorch';

// inside the component
const llm = useLLM({ model: LLAMA3_2_1B });

const handleSendMessage = async () => {
  if (!inputText.trim() || !llm?.isReady || llm?.isGenerating) return;

  try {
    Keyboard.dismiss();
    await llm?.sendMessage(inputText.trim());
    setInputText('');
  } catch (error) {
    Alert.alert('Error', 'Failed to send message. Please try again.');
  }
};

Context-Aware Tutor Chat (TutorChatScreen.tsx)

The tutor chat provides document-specific assistance:

const handleSendMessage = async () => {
  const systemPrompt = `You are a helpful AI tutor assistant. Your role is to help students understand and learn from their PDF document content.`;

  const contextMessages = chatState.messages.slice(-5);
  const messages = [
    { role: 'system' as const, content: systemPrompt },
    { role: 'system' as const, content: `Document content for reference:\n\n${document.extractedText?.slice(0, 2000) || 'No document content available'}` },
    ...contextMessages.map((msg) => ({
      role: msg.role === 'user' ? ('user' as const) : ('assistant' as const),
      content: msg.content,
    })),
    { role: 'user' as const, content: userMessage.content },
  ];

  const response = await llm.generate(messages);
};

6. AI-Powered Content Generation

Beyond chat capabilities, the app generates various study materials using AI. The content generation system follows a consistent pattern across the different tools while maintaining type safety and simple error handling.

Summary Generation (SummaryScreen.tsx)

The summary generation provides concise document overviews:

const handleGenerateSummary = async (): Promise<void> => {
  if (!document || !llm?.isReady) return;

  setIsGeneratingSummary(true);
  setSummaryError(null);

  try {
    const systemPrompt = `You are a helpful assistant that generates concise summaries of documents. 
    Summarize the key points, main concepts, and important information from the provided text. Keep the summary focused and relevant.`;

    const messages: Message[] = [
      {
        role: 'system' as const,
        content: systemPrompt,
      },
      {
        role: 'user' as const,
        content: `Please summarize the following document:
        \n\n${document.extractedText || ''}`,
      },
    ];

    const summary: string = await llm.generate(messages);
    const updatedDocument: PDFDocument = { ...document, summary };

    setDocument(updatedDocument);
    StorageService.getInstance().updateDocument(updatedDocument);
  } catch (error) {
    console.error('Error generating summary:', error);
    setSummaryError('Failed to generate summary. Please try again.');
  } finally {
    setIsGeneratingSummary(false);
  }
};

Quiz Generation (QuizScreen.tsx)

The quiz system generates interactive questions in multiple formats; this is a perfect example of structured output being used:

interface QuizState {
  questions: Question[];
  currentQuestionIndex: number;
  selectedAnswers: (number | string | null)[];
  showResults: boolean;
  score: number;
}

const handleGenerateQuiz = async (): Promise<void> => {
  if (!document || !llm?.isReady) return;

  setIsGeneratingQuiz(true);
  setQuizError(null);

  try {
    const systemPrompt = `You are a helpful assistant that generates educational quiz questions. 
    Generate diverse, challenging questions based on the provided text using ONLY these question types:
    - multiple-choice (with 4 options)
    - true-false (with correctAnswer: 0 for true, 1 for false)
    - short-answer (with correctAnswer as a string)

    IMPORTANT: 
    - Return ONLY a valid JSON array
    - Use ONLY the exact type names: "multiple-choice", "true-false", "short-answer"
    - Do not include any explanatory text, markdown formatting, or additional commentary
    - Just the raw JSON array`;

    const messages: Message[] = [
      {
        role: 'system' as const,
        content: systemPrompt,
      },
      {
        role: 'user' as const,
        content: `Generate 5 quiz questions based on this document text:
        \n\n${document.extractedText || ''}

        Return ONLY a JSON array using ONLY these three question types:
        [{
          "id": "unique_id",
          "type": "multiple-choice",
          "question": "Question text?",
          "options": ["Option A", "Option B", "Option C", "Option D"],
          "correctAnswer": 0,
          "explanation": "Explanation of correct answer",
          "difficulty": "medium"
        }]`,
      },
    ];

    const response: string = await llm.generate(messages);
    const questions: Question[] = parseQuizResponse(response);

    // Save quiz to store
    const quiz: Quiz = {
      documentId: document.id,
      title: `Quiz for ${document.name}`,
      questions,
    };
    saveQuiz(quiz);

    setQuizState({
      questions,
      currentQuestionIndex: 0,
      selectedAnswers: new Array(questions.length).fill(null),
      showResults: false,
      score: 0,
    });
  } catch (error) {
    console.error('Error generating quiz:', error);
    setQuizError('Failed to generate quiz. Please try again.');
  } finally {
    setIsGeneratingQuiz(false);
  }
};

Quiz Response Parsing:

const parseQuizResponse = (response: string): Question[] => {
  let jsonStr: string = response.trim();

  // Handle markdown code blocks
  if (jsonStr.includes('```json')) {
    const jsonMatch = jsonStr.match(/```json\s*([\s\S]*?)\s*```/);
    if (jsonMatch) {
      jsonStr = jsonMatch[1].trim();
    }
  } else if (jsonStr.includes('```')) {
    const codeMatch = jsonStr.match(/```\s*([\s\S]*?)\s*```/);
    if (codeMatch) {
      jsonStr = codeMatch[1].trim();
    }
  }

  // Extract JSON array if not at start
  if (!jsonStr.startsWith('[')) {
    const arrayMatch = jsonStr.match(/\[[\s\S]*\]/);
    if (arrayMatch) {
      jsonStr = arrayMatch[0];
    }
  }

  const questionsData: RawQuestion[] = JSON.parse(jsonStr);
  return questionsData.map((q, index): Question => ({
    id: q.id || `q_${Date.now()}_${index}`,
    type: q.type === 'true/false' ? 'true-false' : q.type || 'multiple-choice',
    question: q.question || `Question ${index + 1} about the document content?`,
    options: q.options || ['Option A', 'Option B', 'Option C', 'Option D'],
    correctAnswer: q.correctAnswer || 0,
    explanation: q.explanation || 'This question was generated based on the document content.',
    difficulty: q.difficulty || 'medium',
  }));
};
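Once the questions are parsed, `QuizState.score` has to be computed from `selectedAnswers`. This is a hypothetical sketch of that step, not the app's actual logic: numeric answers are compared by index, and short-answer strings case-insensitively (that matching strategy is an assumption):

```typescript
// Hypothetical scoring helper for QuizState: one point per question
// whose selected answer matches correctAnswer. Unanswered questions
// (null) score nothing.
interface ScoredQuestion {
  correctAnswer: number | string;
}

export function computeScore(
  questions: ScoredQuestion[],
  selectedAnswers: (number | string | null)[]
): number {
  return questions.reduce((score, q, i) => {
    const picked = selectedAnswers[i];
    if (picked === null || picked === undefined) return score;
    const match =
      typeof q.correctAnswer === 'string' && typeof picked === 'string'
        ? picked.trim().toLowerCase() === q.correctAnswer.trim().toLowerCase()
        : picked === q.correctAnswer;
    return match ? score + 1 : score;
  }, 0);
}
```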

Mind Map Generation (MindMapScreen.tsx)

The mind map system creates hierarchical visual representations:

interface MindMapState {
  nodes: MindMapNode[];
  expandedNodes: Set<string>;
}

const handleGenerateMindMap = async (): Promise<void> => {
  if (!document || !llm?.isReady) return;

  setIsGeneratingMindMap(true);
  setMindMapError(null);

  try {
    const systemPrompt = `You are a helpful assistant that creates hierarchical mind maps. 
    Extract key concepts and organize them into a logical hierarchy to help visualize the structure of information.
    Always create at least one root node with children to make the mind map interactive.`;

    const messages: Message[] = [
      {
        role: 'system' as const,
        content: systemPrompt,
      },
      {
        role: 'user' as const,
        content: `Create an interactive mind map structure from this document text:
        \n\n${document.extractedText || ''}

        REQUIREMENTS:
        1. Create a hierarchical structure with at least 3 levels
        2. Include a main root node (level 0) with meaningful children
        3. Each parent node should have 2-4 child nodes
        4. Use meaningful, concise labels (max 3-4 words)

        Return ONLY a JSON array with this EXACT structure:
        [
          {
            "id": "root",
            "label": "Main Topic",
            "children": ["topic1", "topic2"],
            "parent": null,
            "level": 0
          }
        ]`,
      },
    ];

    const response: string = await llm.generate(messages);
    const processedNodes: MindMapNode[] = parseMindMapResponse(response);

    // Find root nodes and set up initial expansion
    const rootNodeIds: string[] = processedNodes
      .filter((node: MindMapNode) => !node.parent || node.parent === null)
      .map((node: MindMapNode) => node.id);

    setMindMapState({
      nodes: processedNodes,
      expandedNodes: new Set(rootNodeIds), // Expand all root nodes by default
    });
  } catch (error) {
    console.error('Error generating mind map:', error);
    setMindMapError('Failed to generate mind map. Please try again.');
  } finally {
    setIsGeneratingMindMap(false);
  }
};
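With `expandedNodes` stored as a `Set`, rendering the interactive map reduces to filtering for nodes whose entire ancestor chain is expanded. A sketch assuming the node shape requested in the prompt above (this helper is illustrative, not the app's actual rendering code):

```typescript
// Node shape as requested from the LLM in the mind map prompt.
interface MindMapNode {
  id: string;
  label: string;
  children: string[];
  parent: string | null;
  level: number;
}

// A node is visible iff every one of its ancestors is expanded;
// root nodes (no parent) are always visible.
export function getVisibleNodes(
  nodes: MindMapNode[],
  expanded: Set<string>
): MindMapNode[] {
  const byId = new Map(nodes.map((n) => [n.id, n]));
  return nodes.filter((node) => {
    let parent = node.parent ? byId.get(node.parent) : null;
    while (parent) {
      if (!expanded.has(parent.id)) return false;
      parent = parent.parent ? byId.get(parent.parent) : null;
    }
    return true;
  });
}
```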

Performance Optimizations

  1. Lazy Loading: AI models are only loaded when needed

  2. Chunked Processing: Large PDFs are processed in chunks to prevent UI blocking

  3. Efficient Storage: MMKV provides faster read/write operations than AsyncStorage

  4. Memoization: Used React.memo and useMemo for expensive computations
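The "chunked processing" point above boils down to yielding back to the event loop between batches so the JS thread can keep handling gestures and rendering. A generic sketch (the batch size and the setTimeout-based yield are illustrative choices, not the app's exact code):

```typescript
// Process a large array in small batches, yielding to the event loop
// between batches so UI work is not starved.
export async function processInBatches<T, R>(
  items: T[],
  worker: (item: T) => R,
  batchSize = 50
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    for (const item of items.slice(i, i + batchSize)) {
      results.push(worker(item));
    }
    // Yield so pending UI work (gestures, rendering) can run.
    await new Promise<void>((resolve) => setTimeout(resolve, 0));
  }
  return results;
}
```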

Lessons Learned

What Worked Well

  1. React Native ExecuTorch Integration: Provided excellent offline AI capabilities

  2. MMKV Storage: Significantly improved performance over AsyncStorage

  3. Webview PDF Parsing: Ensured compatibility with various PDF formats

  4. Zustand State Management: Simplified state management with great performance

Future Enhancements

  1. Vector Embeddings: Implement proper vector embeddings for better RAG

  2. Model Optimization: Quantize newer models for better performance

  3. Multi-Document Support: Allow chat across multiple documents

  4. Fix Cross-Platform Issues: Some native features behave differently on iOS vs Android

  5. Model Size & Memory Constraints: AI models are large and require careful download management, and device capabilities need to be checked in advance to select the appropriate model. Right now the model might make the app crash if there’s not enough available memory on the device.

Conclusion

Building an offline AI chat tool required solving a few challenges in PDF processing, AI model management, and mobile performance optimization. The result is a study assistant that works entirely offline while providing intelligent, context-aware responses.

React Native ExecuTorch doesn't currently leverage the native GPU acceleration available on the latest iPhones and Android devices, but it provides a good testing ground and proof of concept. The next step would be to integrate Apple's CoreML backend and see how it compares.

The LearnPDF app is available on both the iOS App Store and Google Play Store (testers needed). Feel free to try it out, but keep in mind that it has not been tested on many devices.