Statistics on SDS evaluations

A brief summary of our experience with the evaluated safety data sheets

We have processed more than 1 million safety data sheets (SDSs) so far. Here is a brief summary of what we have noticed in them.

Evaluation of safety data sheets

You can’t improve what you can’t measure. For this reason, we at SdbHub keep and review statistics on the safety data sheets we process. The mistakes and inconsistencies we find give us valuable insight into the industry-specific and age-specific idiosyncrasies of SDSs.

  1. OCR – Often asked, rarely relevant: We are very often asked whether SdbHub can also read data from scanned documents. It can: we have our own OCR technology, which we train specifically for different purposes, so the character recognition is adapted to the respective application. For SDSs, however, we have found that at most about 1% are available only as images. The older the SDS, the more likely it is to be an image. The vast majority of SDSs are available as digital documents.
  2. An SDS for eternity: When looking at the SDSs, we found that the date and the version are the pieces of information most often missing. It is therefore important to define a workflow of your own for such cases, and corrections must follow standardized procedures.
  3. Hazard pictogram mashup: Processing SDSs can sometimes be amusing. For example, we find SDSs in which the pictograms are rotated, stitched together from halves of two different pictograms, or so small that it is impossible to tell which hazard pictograms are shown. We have therefore built quality checks into SdbHub that point out such inconsistencies.
  4. Sections? Don’t really need them: Of course, there are also suppliers who make things extremely easy for themselves and simply omit sections or mix information from different sections at random. This happens less and less often, but automated extraction should still be trained for such cases. This is the only way to prevent incorrect information from being extracted.
  5. Oops, where did that text come from? When we first encountered this problem in SDSs, we had to find a solution quickly. PDFs can contain text that is invisible to human readers, for example because the text color matches the background color. When the document is read out digitally, exactly this hidden text suddenly appears. Some of you may have noticed this when copying text out of a PDF. But don’t worry: SdbHub detects such background text automatically and flags it in the returned results.
  6. Bad software or too lazy? Some suppliers make life difficult for their customers by listing the H, P, or EUH phrases without their codes. Deriving the codes from the phrase texts by hand takes enormous effort. SdbHub remedies this by returning the phrases found in the document together with the matching codes. We see the same with classifications, which are sometimes written in such a convoluted way that the assignment to categories is hard to read. SdbHub provides a remedy for this as well.
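
To illustrate the idea behind item 6, deriving a code from a phrase that is printed without one can be sketched as a simple text lookup. This is a minimal sketch and not SdbHub's actual implementation: the names `H_PHRASES`, `normalize`, and `find_h_codes` are hypothetical, and the table holds only four standard hazard statements, where a real one would need the complete H, P, and EUH lists.

```python
import re
from typing import List

# Small excerpt of standard GHS/CLP hazard statements. A real table would
# hold the complete H, P and EUH lists (this selection is illustrative).
H_PHRASES = {
    "H225": "Highly flammable liquid and vapour.",
    "H302": "Harmful if swallowed.",
    "H315": "Causes skin irritation.",
    "H319": "Causes serious eye irritation.",
}


def normalize(text: str) -> str:
    """Lowercase the text and turn every run of non-alphanumerics into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


# Normalized phrase text -> code (hypothetical helper index).
_INDEX = {normalize(phrase): code for code, phrase in H_PHRASES.items()}


def find_h_codes(section_text: str) -> List[str]:
    """Return the sorted codes whose phrase text occurs in the given SDS text."""
    # Pad with spaces so phrases only match on whole-word boundaries.
    haystack = f" {normalize(section_text)} "
    return sorted(
        code for phrase, code in _INDEX.items() if f" {phrase} " in haystack
    )


# Phrases listed without codes, including a line break inside one phrase.
sample = "Hazard statements: Harmful if swallowed. Causes serious\neye irritation."
print(find_h_codes(sample))  # prints ['H302', 'H319']
```

Normalizing case, punctuation, and line breaks before matching already absorbs many layout differences. The sketch only finds phrases whose wording matches the table exactly, so reworded, translated, or differently spelled phrases would need a larger table or fuzzy matching.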

This list is far from complete, but it covers the most important peculiarities.

Thanks to all of you who keep providing us with feedback, so that we can continuously optimize SdbHub for extremely high precision and ease of use.