Could we add a method to sanitize PDFs that preserves the metadata?
It would be better to strip active content such as JavaScript and actions without flattening the PDF and losing all the text data. Keeping the original text is better than sending it through OCR again.
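To make the idea concrete, here is a rough sketch of the stripping step. It does not parse a real PDF: it assumes the object tree has already been loaded into plain Python dicts and lists (a real version would walk the tree through a PDF library such as pikepdf or pypdf, which is my assumption, not something decided here). The function name `strip_active_content` and the key list are hypothetical. The keys themselves are the standard PDF dictionary names for open actions, additional actions, script bodies, and the document-level JavaScript name tree.

```python
from typing import Any, Optional, Set

# Keys that always carry active content in a PDF dictionary.
ALWAYS_STRIP = {"/OpenAction", "/AA", "/JS", "/JavaScript"}


def _is_action(value: Any) -> bool:
    # Heuristic (assumption): action dictionaries carry an /S entry naming
    # the action type. This avoids deleting unrelated /A entries, such as
    # structure-element attributes.
    return isinstance(value, dict) and "/S" in value


def strip_active_content(obj: Any, _seen: Optional[Set[int]] = None) -> int:
    """Remove active-content keys in place and return how many were removed.

    Text, fonts, and metadata are left untouched because only the keys
    listed above are deleted; nothing is rendered or flattened.
    """
    if _seen is None:
        _seen = set()
    if id(obj) in _seen:  # PDF object graphs can contain cycles (/Parent)
        return 0
    removed = 0
    if isinstance(obj, dict):
        _seen.add(id(obj))
        for key in list(obj):
            value = obj[key]
            if key in ALWAYS_STRIP or (key == "/A" and _is_action(value)):
                del obj[key]
                removed += 1
            else:
                removed += strip_active_content(value, _seen)
    elif isinstance(obj, list):
        _seen.add(id(obj))
        for item in obj:
            removed += strip_active_content(item, _seen)
    return removed


# Toy document: one open action, one JavaScript name tree, one link action.
doc = {
    "/Root": {
        "/OpenAction": {"/S": "/JavaScript", "/JS": "app.alert('hi')"},
        "/Names": {"/JavaScript": {"/Names": []}},
        "/Pages": {"/Kids": [{
            "/Contents": "BT (Hello) Tj ET",
            "/Annots": [{"/Subtype": "/Link",
                         "/A": {"/S": "/URI", "/URI": "http://example.com"}}],
        }]},
    },
    "/Info": {"/Title": "Report"},
}
removed = strip_active_content(doc)
```

Note that this also removes ordinary link actions (the `/A` on the link annotation above), since the request covers actions in general. If plain URI links should survive, the check in `_is_action` could be narrowed to specific action types.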