
Offline AI with React Native

Combining local LLMs and RAG to help you study PDF content on your smartphone using React Native


In a world where internet connectivity is often unreliable (especially while on a plane), and privacy concerns are growing, I set out to build an AI-powered study assistant that works entirely offline. The goal was to explore what can be done with available tools: create a React Native app that could parse PDFs, extract meaningful content, and provide intelligent chat capabilities - all without requiring an internet connection.

The User Experience

I wanted the app to provide a seamless experience in these ways:

  1. Upload PDF: Users can select PDFs from their device

  2. Automatic Processing: The app extracts text and creates chunks automatically

  3. Study Materials: Generates summaries, quizzes, flashcards, and mind maps

  4. AI Chat: Two chat modes - general AI assistant and document-specific tutor

  5. Offline Operation: Everything works without internet connectivity

Architecture Overview

The LearnPDF app follows a simple architecture that combines several currently available technologies:

  • React Native/Expo for cross-platform mobile development (And why not use JS for everything?)

  • React Native ExecuTorch for on-device AI model execution (running Llama 3.2 1B model)

  • MMKV for high-performance local storage

  • Zustand with Immer for state management

  • WebView-based PDF parsing for text extraction

  • RAG (Retrieval-Augmented Generation) for context-aware responses (using All-MiniLM-L6-v2 model for embeddings)

The Journey: From Parsing to Intelligence

1. PDF Processing Pipeline

The foundation of the app lies in its ability to extract and process PDF content locally. I implemented a custom PDF processing system, which in the end turned out to be the most challenging part: PDF.js (and any other tool based on it) can't run in a React Native environment because there is no support for web workers, and I initially didn't want to use web views:

Custom PDF Parser (PDFExtractor.ts)

export class PDFTextExtractor {
  private buffer: Uint8Array;
  private objects: Map<number, PDFObject> = new Map();
  private pages: PDFPage[] = [];

  async extractText(): Promise<string> {
    // parse the cross-reference table to find the address of each object
    this.parseObjects();
    this.findPages();
    return await this.extractTextFromPages();
  }
}

So I set out to learn how to parse a PDF and extract its text manually.
This first custom parser handles:

  • PDF object parsing and decompression using fflate

  • Stream extraction and text command parsing

  • Page-by-page text extraction

  • React Native-compatible string conversion
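To give a flavor of the "text command parsing" bullet above, here is a minimal, hypothetical sketch (not the app's actual parser): it pulls the string operands of `Tj` operators out of an already-decompressed PDF content stream. Real streams also use `TJ` arrays, hex strings, and font-specific encodings, which is exactly why this approach struggled with most PDFs.

```typescript
// Hypothetical helper illustrating the text-command parsing step:
// extract the string operands of Tj operators from a decompressed
// PDF content stream. TJ arrays, hex strings (<...>), and font
// encodings are deliberately ignored in this sketch.
export function extractTjStrings(contentStream: string): string[] {
  const out: string[] = [];
  // Match "(...)" followed by the Tj operator; PDF escapes \( \) \\ etc.
  const tj = /\(((?:\\.|[^\\()])*)\)\s*Tj/g;
  let m: RegExpExecArray | null;
  while ((m = tj.exec(contentStream)) !== null) {
    // Unescape the common PDF string escapes.
    out.push(
      m[1].replace(/\\([()\\nrt])/g, (_, c) =>
        c === 'n' ? '\n' : c === 'r' ? '\r' : c === 't' ? '\t' : c
      )
    );
  }
  return out;
}
```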

This method worked for simple PDFs with a limited set of encodings, but performed poorly on most of my PDFs. I had to accept that the goal was not to rewrite PDF.js to run on React Native without web views, but simply to extract text, and so I painfully forced myself to integrate a web view into the project.

WebView-Based Processing (PDFWebView.tsx)

So I set up the web view (using react-native-webview) in which I could run PDF.js and it worked amazingly well even for more complex PDFs:

const PDFWebView: React.FC<PDFWebViewProps> = ({
  fileUri,
  fileName,
  onTextExtracted,
}) => {
  const loadPDFInWebView = useCallback(async () => {
    const base64Data = await FileSystem.readAsStringAsync(fileUri, {
      encoding: FileSystem.EncodingType.Base64,
    });

    const message = JSON.stringify({
      type: 'load-pdf',
      pdfData: `data:application/pdf;base64,${base64Data}`,
    });

    webViewRef.current?.postMessage(message);
  }, [fileUri]);
};
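The page inside the WebView (running PDF.js) then posts the extracted text back through `onMessage`. The exact message shape below is an assumption for illustration, not necessarily the app's protocol, but the native side needs a handler along these lines:

```typescript
// Hypothetical shape of the message the PDF.js page posts back;
// the app's actual protocol may differ.
interface ExtractionMessage {
  type: 'text-extracted' | 'error';
  text?: string;
  error?: string;
}

// Parse the raw string from the WebView's onMessage event and decide
// whether to report extracted text or an error.
export function handleWebViewMessage(
  raw: string,
  onTextExtracted: (text: string) => void,
  onError: (message: string) => void
): void {
  let msg: ExtractionMessage;
  try {
    msg = JSON.parse(raw);
  } catch {
    onError('Malformed message from PDF WebView');
    return;
  }
  if (msg.type === 'text-extracted' && typeof msg.text === 'string') {
    onTextExtracted(msg.text);
  } else {
    onError(msg.error ?? 'PDF extraction failed');
  }
}
```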

2. Text Chunking and RAG Implementation

Once text is extracted, it needs to be intelligently chunked for RAG:

const chunkText = (text: string, documentId: string): TextChunk[] => {
  // a simple mechanism to split text (can be improved, e.g. split by paragraphs, ...)
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const chunks: TextChunk[] = [];
  const chunkSize = 400;

  let currentChunk = '';
  let chunkIndex = 0;

  for (const sentence of sentences) {
    if (currentChunk.length + sentence.length + 1 > chunkSize && currentChunk.length > 0) {
      chunks.push({
        id: `${documentId}_chunk_${chunkIndex}`,
        documentId,
        content: currentChunk.trim(),
        metadata: {
          pageNumber: 1,
          section: 'main',
          wordCount: currentChunk.split(/\s+/).length,
        },
      });
      currentChunk = sentence + '. ';
      chunkIndex++;
    } else {
      currentChunk += sentence + '. ';
    }
  }

  // don't forget the final partial chunk left in the buffer
  if (currentChunk.trim().length > 0) {
    chunks.push({
      id: `${documentId}_chunk_${chunkIndex}`,
      documentId,
      content: currentChunk.trim(),
      metadata: {
        pageNumber: 1,
        section: 'main',
        wordCount: currentChunk.split(/\s+/).length,
      },
    });
  }
  return chunks;
};
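For the retrieval side of RAG, the embeddings from All-MiniLM-L6-v2 only need a similarity ranking over the stored chunks. A minimal sketch, assuming an `embedding: number[]` has already been computed per chunk by the embedding model (the `embed` step itself is omitted here):

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Rank chunks by similarity to the query embedding and keep the top k;
// the returned contents get pasted into the tutor prompt as context.
export function retrieveTopK(
  queryEmbedding: number[],
  chunks: { content: string; embedding: number[] }[],
  k = 3
): string[] {
  return [...chunks]
    .sort(
      (x, y) =>
        cosine(queryEmbedding, y.embedding) -
        cosine(queryEmbedding, x.embedding)
    )
    .slice(0, k)
    .map((c) => c.content);
}
```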

3. On-Device AI with ExecuTorch

The heart of the offline chat capability is ExecuTorch, which allows running AI models directly on the device:

export const useLLM = ({ preventLoad = false }): CustomLLMType => {
  // useState hooks for token, response, messageHistory, etc. omitted for brevity
  const controllerInstance = useMemo(
    () => new LLMModule({
      tokenCallback: (newToken: string) => {
        setToken(newToken);
        setResponse((prevResponse) => prevResponse + newToken);
      },
      messageHistoryCallback: setMessageHistory,
    }),
    [] // create the controller once
  );

  useEffect(() => {
    if (preventLoad) return;

    (async () => {
      try {
        await controllerInstance.load(LLAMA3_2_1B, setDownloadProgress);
        setIsReady(true);
      } catch (e) {
        setError(e);
      }
    })();
  }, [controllerInstance, preventLoad]);
};

4. State Management with Zustand

The app uses Zustand with Immer for predictable state management, and persisted to device storage using MMKV:

export const useAppStore = create<AppState>()(
  zustandPersist(
    immer((set, get) => ({
      documents: {
        items: [] as PDFDocument[],
        selectedId: null,
        searchQuery: '',
        sortBy: 'date',
        sortOrder: 'desc',
      },
      // ... other state slices
      actions: createActions(set, get),
    })),
    {
      name: 'learnpdf-store',
      storage: createMMKVStorage(),
    }
  )
);
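`createMMKVStorage` isn't shown above; a minimal adapter from MMKV's `set`/`getString`/`delete` API to zustand persist's `getItem`/`setItem`/`removeItem` shape might look like the sketch below. It is written against a small injected interface so it can be exercised without a device (the app's real version presumably constructs the `MMKV` instance from react-native-mmkv internally):

```typescript
// Minimal key-value surface matching what react-native-mmkv exposes.
interface KVStore {
  set(key: string, value: string): void;
  getString(key: string): string | undefined;
  delete(key: string): void;
}

// Adapter to zustand persist's StateStorage shape.
export function createMMKVStorage(kv: KVStore) {
  return {
    getItem: (name: string): string | null => kv.getString(name) ?? null,
    setItem: (name: string, value: string): void => kv.set(name, value),
    removeItem: (name: string): void => kv.delete(name),
  };
}
```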

5. The Chat Experience

General AI Chat (LLMChatScreen.tsx)

The general chat interface provides a clean, WhatsApp-like experience:

import { useLLM, LLAMA3_2_1B } from 'react-native-executorch';

// inside the component
const llm = useLLM({ model: LLAMA3_2_1B });

const handleSendMessage = async () => {
  if (!inputText.trim() || !llm?.isReady || llm?.isGenerating) return;

  try {
    Keyboard.dismiss();
    await llm?.sendMessage(inputText.trim());
    setInputText('');
  } catch (error) {
    Alert.alert('Error', 'Failed to send message. Please try again.');
  }
};

Context-Aware Tutor Chat (TutorChatScreen.tsx)

The tutor chat provides document-specific assistance:

const handleSendMessage = async () => {
  const systemPrompt = `You are a helpful AI tutor assistant. Your role is to help students understand and learn from their PDF document content.`;

  const contextMessages = chatState.messages.slice(-5);
  const messages = [
    { role: 'system' as const, content: systemPrompt },
    { role: 'system' as const, content: `Document content for reference:\n\n${document.extractedText?.slice(0, 2000) || 'No document content available'}` },
    ...contextMessages.map((msg) => ({
      role: msg.role === 'user' ? ('user' as const) : ('assistant' as const),
      content: msg.content,
    })),
    { role: 'user' as const, content: userMessage.content },
  ];

  const response = await llm.generate(messages);
};

6. AI-Powered Content Generation

Beyond chat capabilities, the app generates various study materials using AI. The content generation system follows a consistent pattern across the different tools while maintaining type safety and simple error handling.

Summary Generation (SummaryScreen.tsx)

The summary generation provides concise document overviews:

const handleGenerateSummary = async (): Promise<void> => {
  if (!document || !llm?.isReady) return;

  setIsGeneratingSummary(true);
  setSummaryError(null);

  try {
    const systemPrompt = `You are a helpful assistant that generates concise summaries of documents. 
    Summarize the key points, main concepts, and important information from the provided text. Keep the summary focused and relevant.`;

    const messages: Message[] = [
      {
        role: 'system' as const,
        content: systemPrompt,
      },
      {
        role: 'user' as const,
        content: `Please summarize the following document:
        \n\n${document.extractedText || ''}`,
      },
    ];

    const summary: string = await llm.generate(messages);
    const updatedDocument: PDFDocument = { ...document, summary };

    setDocument(updatedDocument);
    StorageService.getInstance().updateDocument(updatedDocument);
  } catch (error) {
    console.error('Error generating summary:', error);
    setSummaryError('Failed to generate summary. Please try again.');
  } finally {
    setIsGeneratingSummary(false);
  }
};

Quiz Generation (QuizScreen.tsx)

The quiz system generates interactive questions in multiple formats; this is a perfect example of structured output being used:

interface QuizState {
  questions: Question[];
  currentQuestionIndex: number;
  selectedAnswers: (number | string | null)[];
  showResults: boolean;
  score: number;
}

const handleGenerateQuiz = async (): Promise<void> => {
  if (!document || !llm?.isReady) return;

  setIsGeneratingQuiz(true);
  setQuizError(null);

  try {
    const systemPrompt = `You are a helpful assistant that generates educational quiz questions. 
    Generate diverse, challenging questions based on the provided text using ONLY these question types:
    - multiple-choice (with 4 options)
    - true-false (with correctAnswer: 0 for true, 1 for false)
    - short-answer (with correctAnswer as a string)

    IMPORTANT: 
    - Return ONLY a valid JSON array
    - Use ONLY the exact type names: "multiple-choice", "true-false", "short-answer"
    - Do not include any explanatory text, markdown formatting, or additional commentary
    - Just the raw JSON array`;

    const messages: Message[] = [
      {
        role: 'system' as const,
        content: systemPrompt,
      },
      {
        role: 'user' as const,
        content: `Generate 5 quiz questions based on this document text:
        \n\n${document.extractedText || ''}

        Return ONLY a JSON array using ONLY these three question types:
        [{
          "id": "unique_id",
          "type": "multiple-choice",
          "question": "Question text?",
          "options": ["Option A", "Option B", "Option C", "Option D"],
          "correctAnswer": 0,
          "explanation": "Explanation of correct answer",
          "difficulty": "medium"
        }]`,
      },
    ];

    const response: string = await llm.generate(messages);
    const questions: Question[] = parseQuizResponse(response);

    // Save quiz to store
    const quiz: Quiz = {
      documentId: document.id,
      title: `Quiz for ${document.name}`,
      questions,
    };
    saveQuiz(quiz);

    setQuizState({
      questions,
      currentQuestionIndex: 0,
      selectedAnswers: new Array(questions.length).fill(null),
      showResults: false,
      score: 0,
    });
  } catch (error) {
    console.error('Error generating quiz:', error);
    setQuizError('Failed to generate quiz. Please try again.');
  } finally {
    setIsGeneratingQuiz(false);
  }
};

Quiz Response Parsing:

const parseQuizResponse = (response: string): Question[] => {
  let jsonStr: string = response.trim();

  // Handle markdown code blocks
  if (jsonStr.includes('```json')) {
    const jsonMatch = jsonStr.match(/```json\s*([\s\S]*?)\s*```/);
    if (jsonMatch) {
      jsonStr = jsonMatch[1].trim();
    }
  } else if (jsonStr.includes('```')) {
    const codeMatch = jsonStr.match(/```\s*([\s\S]*?)\s*```/);
    if (codeMatch) {
      jsonStr = codeMatch[1].trim();
    }
  }

  // Extract JSON array if not at start
  if (!jsonStr.startsWith('[')) {
    const arrayMatch = jsonStr.match(/\[[\s\S]*\]/);
    if (arrayMatch) {
      jsonStr = arrayMatch[0];
    }
  }

  const questionsData: RawQuestion[] = JSON.parse(jsonStr);
  return questionsData.map((q, index): Question => ({
    id: q.id || `q_${Date.now()}_${index}`,
    type: q.type === 'true/false' ? 'true-false' : q.type || 'multiple-choice',
    question: q.question || `Question ${index + 1} about the document content?`,
    options: q.options || ['Option A', 'Option B', 'Option C', 'Option D'],
    correctAnswer: q.correctAnswer || 0,
    explanation: q.explanation || 'This question was generated based on the document content.',
    difficulty: q.difficulty || 'medium',
  }));
};
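Once the questions are parsed, `QuizState.score` has to be computed from `selectedAnswers`. This is a hypothetical sketch of that step, not the app's actual logic: numeric answers are compared by index, and short-answer strings case-insensitively (that matching strategy is an assumption):

```typescript
// Hypothetical scoring helper for QuizState: one point per question
// whose selected answer matches correctAnswer. Unanswered questions
// (null) score nothing.
interface ScoredQuestion {
  correctAnswer: number | string;
}

export function computeScore(
  questions: ScoredQuestion[],
  selectedAnswers: (number | string | null)[]
): number {
  return questions.reduce((score, q, i) => {
    const picked = selectedAnswers[i];
    if (picked === null || picked === undefined) return score;
    const match =
      typeof q.correctAnswer === 'string' && typeof picked === 'string'
        ? picked.trim().toLowerCase() === q.correctAnswer.trim().toLowerCase()
        : picked === q.correctAnswer;
    return match ? score + 1 : score;
  }, 0);
}
```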

Mind Map Generation (MindMapScreen.tsx)

The mind map system creates hierarchical visual representations:

interface MindMapState {
  nodes: MindMapNode[];
  expandedNodes: Set<string>;
}

const handleGenerateMindMap = async (): Promise<void> => {
  if (!document || !llm?.isReady) return;

  setIsGeneratingMindMap(true);
  setMindMapError(null);

  try {
    const systemPrompt = `You are a helpful assistant that creates hierarchical mind maps. 
    Extract key concepts and organize them into a logical hierarchy to help visualize the structure of information.
    Always create at least one root node with children to make the mind map interactive.`;

    const messages: Message[] = [
      {
        role: 'system' as const,
        content: systemPrompt,
      },
      {
        role: 'user' as const,
        content: `Create an interactive mind map structure from this document text:
        \n\n${document.extractedText || ''}

        REQUIREMENTS:
        1. Create a hierarchical structure with at least 3 levels
        2. Include a main root node (level 0) with meaningful children
        3. Each parent node should have 2-4 child nodes
        4. Use meaningful, concise labels (max 3-4 words)

        Return ONLY a JSON array with this EXACT structure:
        [
          {
            "id": "root",
            "label": "Main Topic",
            "children": ["topic1", "topic2"],
            "parent": null,
            "level": 0
          }
        ]`,
      },
    ];

    const response: string = await llm.generate(messages);
    const processedNodes: MindMapNode[] = parseMindMapResponse(response);

    // Find root nodes and set up initial expansion
    const rootNodeIds: string[] = processedNodes
      .filter((node: MindMapNode) => !node.parent || node.parent === null)
      .map((node: MindMapNode) => node.id);

    setMindMapState({
      nodes: processedNodes,
      expandedNodes: new Set(rootNodeIds), // Expand all root nodes by default
    });
  } catch (error) {
    console.error('Error generating mind map:', error);
    setMindMapError('Failed to generate mind map. Please try again.');
  } finally {
    setIsGeneratingMindMap(false);
  }
};
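With `expandedNodes` stored as a `Set`, rendering the interactive map reduces to filtering for nodes whose entire ancestor chain is expanded. A sketch assuming the node shape requested in the prompt above (this helper is illustrative, not the app's actual rendering code):

```typescript
// Node shape as requested from the LLM in the mind map prompt.
interface MindMapNode {
  id: string;
  label: string;
  children: string[];
  parent: string | null;
  level: number;
}

// A node is visible iff every one of its ancestors is expanded;
// root nodes (no parent) are always visible.
export function getVisibleNodes(
  nodes: MindMapNode[],
  expanded: Set<string>
): MindMapNode[] {
  const byId = new Map(nodes.map((n) => [n.id, n]));
  return nodes.filter((node) => {
    let parent = node.parent ? byId.get(node.parent) : null;
    while (parent) {
      if (!expanded.has(parent.id)) return false;
      parent = parent.parent ? byId.get(parent.parent) : null;
    }
    return true;
  });
}
```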

Performance Optimizations

  1. Lazy Loading: AI models are only loaded when needed

  2. Chunked Processing: Large PDFs are processed in chunks to prevent UI blocking

  3. Efficient Storage: MMKV provides faster read/write operations than AsyncStorage

  4. Memoization: Used React.memo and useMemo for expensive computations
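The "chunked processing" point above boils down to yielding back to the event loop between batches so the JS thread can keep handling gestures and rendering. A generic sketch (the batch size and the setTimeout-based yield are illustrative choices, not the app's exact code):

```typescript
// Process a large array in small batches, yielding to the event loop
// between batches so UI work is not starved.
export async function processInBatches<T, R>(
  items: T[],
  worker: (item: T) => R,
  batchSize = 50
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    for (const item of items.slice(i, i + batchSize)) {
      results.push(worker(item));
    }
    // Yield so pending UI work (gestures, rendering) can run.
    await new Promise<void>((resolve) => setTimeout(resolve, 0));
  }
  return results;
}
```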

Lessons Learned

What Worked Well

  1. React Native ExecuTorch Integration: Provided excellent offline AI capabilities

  2. MMKV Storage: Significantly improved performance over AsyncStorage

  3. Webview PDF Parsing: Ensured compatibility with various PDF formats

  4. Zustand State Management: Simplified state management with great performance

Future Enhancements

  1. Vector Embeddings: Implement proper vector embeddings for better RAG

  2. Model Optimization: Quantize newer models for better performance

  3. Multi-Document Support: Allow chat across multiple documents

  4. Fix Cross-Platform Issues: Some native features behave differently on iOS vs Android

  5. Model Size & Memory Constraints: AI models are large and require careful download management, and device capabilities need to be checked in advance to select the appropriate model. Right now the model might make the app crash if there’s not enough available memory on the device.

Conclusion

Building an offline AI chat tool required solving a few challenges in PDF processing, AI model management, and mobile performance optimization. The result is a study assistant that works entirely offline while providing intelligent, context-aware responses.

React Native ExecuTorch doesn't currently leverage the native GPU acceleration available on the latest iPhones and Android devices, but it provides a good testing ground and proof of concept. The next step would be to integrate Apple's CoreML backend and see how it compares.

The LearnPDF app is available on both the iOS App Store and Google Play Store (testers needed). Feel free to try it out, but keep in mind that it has not been tested on many devices.