I have been doing some research, and I am trying to understand the standard way to read a pptx file with JavaScript/TypeScript in the browser.
Many of the libraries I have found, such as textract, are mainly for Node. I found one library called JS-PPTX, but its last commit was made in 2016, so that is not very promising.
Most of the libraries are about creating a PowerPoint presentation, but what I really need is to read the file and identify the contents of the slides.
I am happy to read the raw file format and parse it myself if that is better; I just need a way to upload and read the file with the FileReader API.
Alternatively, if there is a way to convert the pptx to another format that is easier to read, I would be open to that. I found one library called PPTX2HTML, but its last commit is from 2017.
I found this Stack Overflow post, but it is from 2010, so I am hoping the thinking has evolved since then.
PPTX (see the spec here) is a zipped, XML-based file format that is part of the Microsoft Office Open XML (also known as OOXML or OpenXML) specification, introduced with Microsoft Office 2007.
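Because the container is a zip archive, a cheap sanity check on an uploaded file is to look at its first four bytes before trying to unpack it. This is only a sketch; the name looksLikeZip is made up for illustration, and passing the check does not prove the archive is a valid PPTX.

```typescript
// Local file header signature that starts a non-empty zip archive: "PK\x03\x04".
// (An empty archive starts differently, but a PPTX always contains entries.)
const ZIP_SIGNATURE = [0x50, 0x4b, 0x03, 0x04];

// Hypothetical helper: true when the buffer begins with the zip signature.
function looksLikeZip(bytes: Uint8Array): boolean {
  return (
    bytes.length >= ZIP_SIGNATURE.length &&
    ZIP_SIGNATURE.every((expected, i) => bytes[i] === expected)
  );
}
```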
Browsers can parse XML, so you probably have to:
- read the file with FileReader,
- unzip it somehow,
- parse it with DOMParser,
- maybe transform it with XSLT.
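The steps above could be sketched roughly as follows. Treat this as an outline under assumptions, not a tested implementation: the Unzip type is a hypothetical contract that you would back with whatever zip reader you choose, the function names are invented for illustration, and the optional XSLT step is left out. It also assumes slides are stored under the conventional ppt/slides/slideN.xml names and that slide text sits in DrawingML a:t elements.

```typescript
// Namespace of DrawingML, where the <a:t> text-run element is defined.
const DRAWINGML_NS = "http://schemas.openxmlformats.org/drawingml/2006/main";

// Conventional location of slide parts inside the archive (an assumption;
// strictly, parts are located through the package relationships).
const SLIDE_PART = /^ppt\/slides\/slide(\d+)\.xml$/;

// Hypothetical unzip contract: entry path -> decoded XML text.
type Unzip = (data: ArrayBuffer) => Promise<Record<string, string>>;

// Step 1: read the uploaded file with FileReader.
function readAsArrayBuffer(file: Blob): Promise<ArrayBuffer> {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result as ArrayBuffer);
    reader.onerror = () => reject(reader.error);
    reader.readAsArrayBuffer(file);
  });
}

// Pick out the slide parts and sort them by number, so slide10 follows slide9.
// File-name order is not guaranteed to match the presentation order.
function slidePaths(entryNames: string[]): string[] {
  const slideNumber = (name: string): number =>
    Number(SLIDE_PART.exec(name)?.[1]);
  return entryNames
    .filter((name) => SLIDE_PART.test(name))
    .sort((a, b) => slideNumber(a) - slideNumber(b));
}

// Step 3: parse one slide's XML with DOMParser and collect its text runs.
function slideText(xml: string): string[] {
  const doc = new DOMParser().parseFromString(xml, "application/xml");
  const runs = doc.getElementsByTagNameNS(DRAWINGML_NS, "t");
  return Array.from(runs, (run) => run.textContent ?? "");
}

// Glue: read, unzip (step 2, injected), then parse every slide.
async function readPptxText(file: Blob, unzip: Unzip): Promise<string[][]> {
  const entries = await unzip(await readAsArrayBuffer(file));
  return slidePaths(Object.keys(entries)).map((path) =>
    slideText(entries[path])
  );
}
```

You would call readPptxText from a file input's change handler, passing the selected File and your own unzip function. The result is one array of text runs per slide.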