
PDF Parsing (Extract Questions and Text)

Upload a pre-created assignment document and let our AI automatically pull the questions and text into Formative!

Written by Neta Raz Studnitski

PDF Parsing

Formative offers the ability to use AI to parse uploaded PDFs.

Teachers can upload a PDF file of a worksheet, test, passage, or similar document and get back a structured, editable assessment with questions, answer choices, correct answers, passages, images, aligned standards, answer choice explanations, hints and even math notation already in place.

This can be very helpful when you have pre-made assignments that you want to import into Formative. It eliminates the need to manually enter the text and each of the questions from the PDF: AI processes the parsing request and pulls the content of the PDF into text boxes and question forms within Formative.

How it works

The parser runs through a multi-step, AI-assisted process:

  1. Upload and convert - The PDF is converted page by page into high-resolution images so the layout, tables, and formatting are preserved.

  2. Extract images - Embedded graphs, diagrams, and figures are pulled out of the PDF and re-attached to the right questions when possible. Decorative images and items like QR codes are filtered out.

  3. Read and rebuild - Each page is processed by a vision model that reads the content and rebuilds it. Passages stay connected to their questions, and math is preserved.

  4. Combine pages - Pages are combined when it helps keep related content together. For example, when a passage or question spans a page break, Luna keeps the parts together.

  5. Generate items - The cleaned rebuild is turned into formative items, including multiple choice, multi-select, short answer, long answer, true/false, and dropdowns. It can also include hints, answer explanations, and suggested standards alignment.

This all runs as a background job with progress updates, so teachers see what’s happening as Luna is working to parse the PDF.
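For readers curious about how a staged background job with progress updates might be organized, here is a minimal sketch. It is not Formative's actual code: the stage names, the `run_parse_job` function, and the `report` callback are all hypothetical and only mirror the five steps listed above.

```python
from typing import Callable, List

# Hypothetical stage names that mirror the five steps above (assumption).
STAGES: List[str] = [
    "Upload and convert",
    "Extract images",
    "Read and rebuild pages",
    "Combine pages",
    "Generate items",
]


def run_parse_job(report: Callable[[str, int], None]) -> List[str]:
    """Run each stage in order and report percent complete after each one."""
    completed: List[str] = []
    for index, stage in enumerate(STAGES, start=1):
        # A real job would do the stage's work here; this sketch only records it.
        completed.append(stage)
        report(stage, index * 100 // len(STAGES))
    return completed


# Collect the progress updates a teacher would see while the job runs.
updates = []
finished = run_parse_job(lambda stage, percent: updates.append((stage, percent)))
```

Each stage finishes before the next begins, and the callback fires once per stage, which is one simple way to surface progress while a long job runs.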

The parser maps content into a fixed set of supported question types, so anything outside multiple choice, multi-select, short answer, long answer, true/false, or dropdown will be approximated to the closest available question type.
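As an illustration of what approximating to the closest question type could look like, here is a small sketch. The six supported types come from the description above; the fallback table, the default of short answer, and the `map_question_type` function are assumptions made for this example and do not describe how Luna actually decides.

```python
# The six supported types named in this article.
SUPPORTED_TYPES = {
    "multiple choice",
    "multi-select",
    "short answer",
    "long answer",
    "true/false",
    "dropdown",
}

# Hypothetical fallbacks, chosen for illustration only (assumption).
CLOSEST_TYPE = {
    "fill in the blank": "short answer",
    "essay": "long answer",
    "matching": "dropdown",
}


def map_question_type(detected: str) -> str:
    """Return a supported type, approximating anything unsupported."""
    kind = detected.strip().lower()
    if kind in SUPPORTED_TYPES:
        return kind
    # Unknown types fall back to short answer in this sketch (assumption).
    return CLOSEST_TYPE.get(kind, "short answer")
```

Under these assumptions, a matching exercise on a worksheet would arrive as a dropdown item, which is why reviewing the parsed questions is worthwhile.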

Note: AI is not a perfect tool. Please review the results of the parsing and edit them as needed.

Step by Step

  1. On your Home page - Click "Create +" and select "Upload Your Content", or if your PDF is hosted in your Google Drive, click "Import with Google".

    Alternatively, open any formative, click the + button to open the 'Add Item' window, and choose "Upload Images or Documents" from the options menu.

  2. Import the PDF by Drag & Drop or file upload (if you used the 'Add Item' window from within the formative you can also pick the PDF directly from Google Drive)

  3. Once the PDF appears in the modal, uncheck any pages you do not wish to include in the formative, and then click "Generate Questions".

  4. Toggle on any additional content you'd like to generate in your formative, such as Answer Choice Explanations, Hints, or aligned Standards, and then click "Generate Questions" again.

  5. Give the system the time it needs to process the request

  6. As the system completes processing, the content of the PDF will populate your formative as text, and the questions extracted directly from the PDF will appear in the right side panel so you can use them to enhance the formative.

  7. At this point, you can review and edit the extracted questions and text to your liking. You can remove or add questions and content items, edit questions and answer keys, and apply different question settings.
