Sensitive Information Types in Office 365 allow you to identify sensitive content held in Exchange and SharePoint Online and to restrict its use.
You can use the built-in rules (e.g. credit card numbers) or define your own.
The rules are applied as part of the search crawl. The content of a document or email is analysed and if, for example, a credit card number is found, a property is set on the document.
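To make the idea concrete, here is a minimal sketch of how a rule of this kind could work: find candidate digit runs, check them with the Luhn checksum, and set a property on the document if one passes. This is an illustration only, not how Office 365 implements its rules. The pattern, the function names and the `sensitive_types` property are all assumptions made up for the example.

```python
import re

# Hypothetical pattern: 13 to 16 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return candidate numbers in the text that pass the checksum, separators removed."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def classify(document: dict) -> dict:
    """Return a copy of the document with a (hypothetical) sensitive_types property set."""
    tagged = dict(document)
    tagged["sensitive_types"] = (
        ["Credit Card Number"] if find_card_numbers(document["body"]) else []
    )
    return tagged


doc = classify({"title": "Invoice", "body": "Paid with 4111 1111 1111 1111 on Monday."})
print(doc["sensitive_types"])
```

The checksum step is what keeps an arbitrary 16-digit reference number from being flagged, which is also why simple pattern matching on its own tends to produce false positives.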
Depending on the licence that you have for Office 365, you can then run searches to identify the offending content or apply policies that restrict its use.
Sensitive information types are not guaranteed to find every offending document, but they are a great broad-brush approach to information security and compliance.
If you are working with scanned images that have been OCR’d and converted to PDF+Text, there is a good chance that their content will be identified too.
Further reading:
https://blogs.office.com/en-us/2014/08/27/search-sensitive-content-sharepoint-onedrive-documents/
https://technet.microsoft.com/en-us/library/dn781122(v=exchg.150).aspx