Automatic SHACL validation for SKOS vocabs

April 18, 2024 | Steffen RörtgenAdrian Pohl

How to add automatic SHACL validation to your vocabs

If you are using SkoHub to publish your vocabularies, we want to show you an easy solution to add some validation to your vocabulary repository to ensure the quality of your vocabularies. You can also use this approach, if you are not using SkoHub, but still want to ensure some basic constraints that we set up in the skohub.shacl.ttl shape, e.g. that every skos:Concept should have at least one skos:prefLabel (which is not required by the SKOS Reference).

Feel free to study the shape file. The sh:messages should give you a good explanation of what is tested with this shape. The sh:severity attribute shows you if the contraint will give a warning or a violation.

The automatic validation step can easily be set up with the SkoHub SHACL shape in the SkoHub Shapes repository. All you have to do is adding the following GitHub Action config to your vocabularies. In the .github/workflows/main.yaml file, add the following steps:

name: Validate TTL Files

on: [push]

jobs:
  check-for-warnings:
    name: Check for Warnings
    runs-on: ubuntu-latest
    steps:
    - name: Checkout repository
      uses: actions/checkout@v3

    - name: Check for Warnings
      run: |
        curl -s https://raw.githubusercontent.com/skohub-io/shapes/main/scripts/checkForWarning.rq >> checkForWarning.rq
        find . -type f -name '*.ttl' | while read file; do
          # Adjust the file path to remove the './' part
          adjusted_file_path=$(echo "$file" | sed 's|^./||')
          echo "Processing $adjusted_file_path with Docker..."
          docker run --rm -v "$(pwd)/$adjusted_file_path:/rdf/test.ttl" skohub/jena:4.6.1 shacl validate --shapes https://raw.githubusercontent.com/skohub-io/shapes/main/skohub.shacl.ttl --data /rdf/test.ttl >> result.ttl
          validation_result="$(docker run --rm --mount type=bind,source=./checkForWarning.rq,target=/rdf/checkForViolation.rq --mount type=bind,source=./result.ttl,target=/rdf/result.ttl skohub/jena:4.6.1 arq --data /rdf/result.ttl --query /rdf/checkForViolation.rq)"
          echo $validation_result
          lines=$(echo "$validation_result" | wc -l )
          # Correct validation has 4 lines of output
          [[ ${lines} -eq 4 ]] || exit 1
        done

  check-for-errors:
    name: Check for Errors
    runs-on: ubuntu-latest
    steps:
    - name: Checkout repository
      uses: actions/checkout@v3

    - name: Check for Errors
      run: |
        curl -s https://raw.githubusercontent.com/skohub-io/shapes/main/scripts/checkForViolation.rq >> checkForViolation.rq
        find . -type f -name '*.ttl' | while read file; do
          # Adjust the file path to remove the './' part
          adjusted_file_path=$(echo "$file" | sed 's|^./||')
          echo "Processing $adjusted_file_path with Docker..."
          docker run --rm -v "$(pwd)/$adjusted_file_path:/rdf/test.ttl" skohub/jena:4.6.1 shacl validate --shapes https://raw.githubusercontent.com/skohub-io/shapes/main/skohub.shacl.ttl --data /rdf/test.ttl >> result.ttl
          validation_result="$(docker run --rm --mount type=bind,source=./checkForViolation.rq,target=/rdf/checkForViolation.rq --mount type=bind,source=./result.ttl,target=/rdf/result.ttl skohub/jena:4.6.1 arq --data /rdf/result.ttl --query /rdf/checkForViolation.rq)"
          echo $validation_result
          lines=$(echo "$validation_result" | wc -l )
          # Correct validation has 4 lines of output
          [[ ${lines} -eq 4 ]] || exit 1
        done

This will activate checks for violations and warnings of the SkoHub SHACL Shape. These checks will also automatically run in the context of a merge request. Unfortunatley you will also get a red error message for warnings, e.g. if you have no license provided in your vocabulary. This is due to the fact that Actions either complete or fail and we have no way to indicate a “warning” message.

What about other code repositories?

The mentioned approach should be applicable to every code repository supporting CI pipelines. The check might have to be adjusted a bit, but the main part should be easy to copy. If you have a requirement for a specific code repository and need support or have developed something you want to share, feel free to reach out at the metadaten.community forum or open an issue.


A blog for SkoHub. This blog is maintained by the SkoHub Community.