SCORM Intelligence FAQs – Filtered

FAQs

Can we see the end-to-end solution? Please see our full video walkthrough here: https://share.descript.com/view/JJsmESBXpuo — this covers the complete pipeline from upload through to skills tagging, AI-assisted search, and the learner response layer.

Can you walk us through parsing of various other formats like PDF, and share a roadmap and any early findings? The Q2 release will add native parsing for PDF, video, audio, and PPT/PPTX. None of these present significant technical challenges — robust Python libraries already handle each format well, so extending coverage is straightforward. If you'd like to send sample content in any of these formats rather than video, we'll run it through the pipeline and include the output in your preview.

Parsing seems easier said than done. PPTs are unstructured data and require a lot of context to process correctly, versus simple text in a PDF. Are there consolidated learnings about content formats and their implications for content processing? The concern about PPT being unstructured doesn't apply to PPTX: it is an XML-based format (a ZIP package of XML parts) and is therefore highly structured and machine-readable. Our approach takes a screenshot of each slide and uses computer vision to cross-validate the text extracted from the XML, so that text which only appears visually, or which reads differently on the rendered slide, is caught. In our tests this has worked reliably across the PPT styles we have tried. Please do send your sample files and we'll demonstrate directly against your content.
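As a rough illustration of the cross-validation idea in the answer above, the sketch below pulls text runs out of one slide's XML and compares them with text recovered from a screenshot of the same slide. It is a minimal sketch under stated assumptions, not Filtered's implementation: the function names, the sample XML, the `ocr_text` input (assumed to come from a separate vision or OCR step) and the 0.8 threshold are all hypothetical.

```python
import difflib
import xml.etree.ElementTree as ET

# DrawingML namespace; text runs in PPTX slide XML are stored in <a:t> elements.
# In a real .pptx (a ZIP archive) each slide lives at ppt/slides/slideN.xml.
DRAWINGML_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"

# Hypothetical, heavily simplified slide used only for this example.
SAMPLE_SLIDE_XML = """<p:sld
  xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main"
  xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
  <p:cSld><p:spTree><p:sp><p:txBody>
    <a:p><a:r><a:t>Fire safety basics</a:t></a:r></a:p>
    <a:p><a:r><a:t>Know your exits</a:t></a:r></a:p>
  </p:txBody></p:sp></p:spTree></p:cSld>
</p:sld>"""


def extract_slide_text(slide_xml: str) -> list[str]:
    """Return the non-empty text runs found in one slide's XML, in document order."""
    root = ET.fromstring(slide_xml)
    runs = [el.text or "" for el in root.iter(f"{{{DRAWINGML_NS}}}t")]
    return [run.strip() for run in runs if run.strip()]


def agreement(xml_runs: list[str], ocr_text: str) -> float:
    """Similarity between XML text and OCR text (0.0 to 1.0), ignoring case and spacing."""
    xml_norm = " ".join(" ".join(xml_runs).lower().split())
    ocr_norm = " ".join(ocr_text.lower().split())
    return difflib.SequenceMatcher(None, xml_norm, ocr_norm).ratio()


def needs_review(xml_runs: list[str], ocr_text: str, threshold: float = 0.8) -> bool:
    """Flag a slide when the two text sources disagree more than the (assumed) threshold allows."""
    return agreement(xml_runs, ocr_text) < threshold
```

A slide whose XML text and screenshot text agree passes straight through; one where they diverge (for example, text baked into an image) would be flagged for a closer look.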

In addition to the content processing side, could you share how answers are served to learners? The learner response layer is best seen in action. This demonstration of the extracted SCORM in Teams provides a sense of how that works: https://share.descript.com/view/nExbJ759WHb

We'd like to see the end response against a learner's query, for example simple text, text plus links, or text plus a marker for further learning. Are there implications for each type, for example does text plus further learning require a very different content processing method? The Teams demonstration video linked in the previous answer shows an example response. Beyond that, the output format is entirely controlled by how the LLM consuming our API is prompted, which gives you full flexibility. Our standard pattern is a narrative response accompanied by direct links to the relevant source content for further learning. The only variable across response types is LLM token cost, since longer outputs cost proportionally more. The underlying content processing method is the same regardless of output format.
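To make the point about prompting concrete, here is a minimal sketch in which the same extracted sections feed three different response styles and only the instruction line changes. Everything named here is an assumption for illustration: the style keys, the instruction wording, and the `title`, `text` and `url` fields are hypothetical and are not Filtered's API schema.

```python
# Hypothetical response styles; only this instruction differs between output formats.
RESPONSE_STYLES = {
    "text": "Answer in plain narrative text only.",
    "text_with_links": "Answer in narrative text, then list the source links you used.",
    "text_with_next_step": "Answer in narrative text, then recommend one source for further learning.",
}


def format_context(sections):
    """Render extracted sections as numbered extracts; identical for every style."""
    return "\n\n".join(
        f"[{i}] {s['title']} ({s['url']})\n{s['text']}"
        for i, s in enumerate(sections, start=1)
    )


def build_prompt(question, sections, style="text_with_links"):
    """Assemble the prompt sent to whichever LLM consumes the extracted data."""
    return (
        "Use only the course extracts below to answer the learner.\n"
        f"{RESPONSE_STYLES[style]}\n\n"
        f"Extracts:\n{format_context(sections)}\n\n"
        f"Learner question: {question}"
    )
```

Because `format_context` does not depend on the chosen style, switching from plain text to text plus links changes the prompt and the length of the answer, not the processed content behind it.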

Are you using the SCORM internal bookmarking to separate out sections within an asset? We don't rely on SCORM's internal bookmarking flags, which are inconsistently implemented across authoring tools. Instead, we infer section boundaries from the course structure itself using the reasoning capability of the LLM. This is a contextually intelligent interpretation rather than a hard rule, and it produces accurate segmentation across a wide range of content architectures.

With the demo, it looks like the tool is breaking the SCORM package into sections and showing metadata. How much of that metadata is extracted directly from the asset and how much is generated? We prioritise transcribed data throughout, because it is the highest-value signal for search and skills tagging. The LLM is deployed as a reasoning tool to accurately locate and surface real transcript sources; we never surface generated transcript data or treat it as authoritative. All module titles are transcribed. We do generate metadata and descriptions to support the analysis layer, but that generated text is not surfaced to the user at all. We also use AI to infer module length and to tag the content, which adds metadata without generating new written content.

If metadata is generated, what are you using to create it: an open model or a proprietary one? Tags are produced by Filtered's proprietary machine learning algorithm for search and indexing. To generate content and reason, we use a recent Anthropic model, which is a proprietary model rather than an open one, hosted within our own Amazon environment. No data is transmitted to Anthropic or any third-party AI provider.

It doesn't appear that the output is structured in a way an LLM could consume directly. Is that correct? No, the output is fully LLM-consumable. Our Teams integration demo and the in-Filtered AI-assisted Search both take the extracted SCORM data and feed it directly to an LLM to generate answers in real time. You can also download the full dataset as a CSV and integrate it into any RAG architecture via vector embeddings or, where infrastructure supports it, use it to fine-tune a custom model at the weight level.
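The CSV-to-RAG route mentioned above can be sketched as follows. This is an illustration only: the column names `section_title` and `transcript` are assumed rather than taken from the actual export, and the bag-of-words `embed` function is a toy stand-in for a real embedding model, used here so the example runs with the standard library alone.

```python
import csv
import io
import math
import re
from collections import Counter

# Hypothetical export; the real CSV columns may differ.
SAMPLE_CSV = """section_title,transcript
Fire exits,Always know where the nearest fire exit is located.
Manual handling,Bend your knees and keep the load close when lifting.
"""


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def load_index(csv_text):
    """Read the exported rows and pair each one with its vector."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return [(row, embed(row["transcript"])) for row in rows]


def retrieve(index, query, k=1):
    """Return the k rows whose transcripts are most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [row for row, _ in ranked[:k]]
```

In a production RAG setup the retrieved rows would then be passed to the LLM as context; swapping `embed` for a real embedding model and the list for a vector store leaves the overall shape unchanged.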

What taxonomy are you using for skills and topics? Would we have control over it, or is it a shared or fixed taxonomy? The taxonomy is entirely custom to your organisation. You define your own skills, skill labels, proficiency levels, and job roles — aligned to your existing competency framework or built from scratch. You can upload your framework directly into Filtered, or we can generate a recommended scale as a starting point for you to refine.

What languages do you cover? The model we use processes input and generates output in most world languages that use standard Unicode characters. Performance varies by language and is strongest in widely spoken languages, but even in languages with fewer digital resources the model maintains meaningful capabilities.

Do you have new questions? Contact success@filtered.com