Creating a Backup and Restoring a Graph Database Repository with enapso-graphdb-cli

Welcome to the guide on how to back up and restore graph database repositories using the enapso-graphdb-cli tool. This document provides step-by-step instructions for installing the tool and running backup and restore operations against popular triplestores such as GraphDB, Fuseki, and Stardog. Regular backups protect against data loss and let you recover a graph database repository quickly after a disruption.

For detailed information, to share feedback, or to contribute, please visit our GitHub repository and check out our package on npm. Sharing and reusing this technical documentation can simplify these processes and help raise awareness about our tools.

Prerequisites

Ensure Node.js is installed on your machine. If not, install it from the Node.js official website. This installation includes npm (Node Package Manager), which manages Node packages.

After installation, verify that Node.js and npm are successfully installed by doing the following:

Open a command prompt or terminal.
Run the command node -v and press Enter. This will display the version of Node.js if it is installed.
Run the command npm -v and press Enter. This will display the version of npm if it is installed.

The ENAPSO tools require Node.js version 10 or higher, as earlier versions might not support some of their functionality.
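The version requirement can also be checked from a script. The following is a minimal sketch; the function names node_major and check_node_version are illustrative and not part of the ENAPSO tooling.

```shell
#!/bin/bash
# Extract the major version from a string such as "v18.17.0" (the format printed by node -v).
node_major() {
    local version="${1#v}"      # drop the leading "v"
    echo "${version%%.*}"       # keep everything before the first dot
}

# Return 0 if the version string meets the minimum major version (default 10).
check_node_version() {
    local major
    major="$(node_major "$1")"
    [ "$major" -ge "${2:-10}" ]
}

# Only run the check when node is actually installed.
if command -v node >/dev/null 2>&1; then
    check_node_version "$(node -v)" || echo "Node.js 10 or higher is required" >&2
fi
```

The comparison looks only at the major version, which is sufficient for a "version 10 or higher" requirement.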

If you need to update Node.js to the latest version, you can download it from the official Node.js website and install it on your system. It will automatically replace the older version with the new one.

Installation

Install the enapso-graphdb-cli tool globally using npm:

npm install -g @innotrade/enapso-graphdb-cli
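After the installation, it can be useful to confirm that the enapsogdb command is reachable from your shell. A minimal sketch, assuming the command name enapsogdb used in the examples of this guide; the helper name tool_available is illustrative:

```shell
#!/bin/bash
# Return 0 if the named command can be found on the PATH.
tool_available() {
    command -v "$1" >/dev/null 2>&1
}

if tool_available enapsogdb; then
    echo "enapsogdb found at $(command -v enapsogdb)"
else
    echo "enapsogdb not found; check that npm's global bin directory is on your PATH"
fi
```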

Supported Triplestores

The tool supports the following Knowledge Graph Platforms:

GraphDB
Fuseki
Stardog

Backup Process

A backup can be run either as a single command or from a script. The command line examples below cover each supported triplestore, so you can tailor the backup to your specific setup. The script that follows them is configured for a Fuseki repository and can be adjusted for the other supported triplestores by changing the appropriate variables.

Command Line Examples for Different Triplestores

Fuseki Example

Fuseki always returns the exported repository data in application/x-trig format. The format cannot be set, so the --format option is omitted.

enapsogdb export --dburl "http://localhost/fuseki" --repository "Test" --targetfile "fuseki_backup.trig" --triplestore "fuseki"

GraphDB Example

enapsogdb export --dburl "http://localhost/graphdb" --repository "Test" --targetfile "graphdb_backup.trig" --format "application/x-trig" --triplestore "graphdb"

Stardog Example

enapsogdb export --dburl "http://localhost/stardog" --repository "Test" --targetfile "stardog_backup.trig" --format "application/x-trig" --triplestore "stardog"
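The three commands above differ only in the URL, the target file, the triplestore name, and whether --format is passed. A minimal sketch of a wrapper that captures this difference is shown below. The function name run_export and the ENAPSOGDB override variable are assumptions introduced for illustration (setting ENAPSOGDB=echo prints the command instead of running it); only the options shown in the examples above are used.

```shell
#!/bin/bash
# Run an export for the given triplestore.
# Arguments: triplestore, database URL, repository name, target file.
run_export() {
    local triplestore="$1" db_url="$2" repository="$3" target_file="$4"
    # Rebuild the positional parameters as the enapsogdb argument list.
    set -- export --dburl "$db_url" --repository "$repository" --targetfile "$target_file"
    # Fuseki does not take a format; GraphDB and Stardog use application/x-trig.
    if [ "$triplestore" != "fuseki" ]; then
        set -- "$@" --format "application/x-trig"
    fi
    set -- "$@" --triplestore "$triplestore"
    # ENAPSOGDB is a hypothetical override, useful for dry runs.
    "${ENAPSOGDB:-enapsogdb}" "$@"
}
```

For example, run_export graphdb "http://localhost/graphdb" "Test" "graphdb_backup.trig" issues the GraphDB command shown above.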

Create a New Script File

Open a text editor such as Notepad++ or Visual Studio Code.
Copy and paste the script content below.

Script Content

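#!/bin/bash
echo "Backup Script for Exporting Ontology from Graph Database Using enapso-graphdb-cli"

# Set Variables
DB_URL="http://localhost/fuseki"
REPOSITORY_NAME="Test"
# FORMAT is not passed for Fuseki; for GraphDB or Stardog, add --format "$FORMAT" to the export command
FORMAT="application/x-trig"
EXPORT_FILE="export.trig"
REPORT_FILE="report.txt"
TRIPLESTORE="fuseki"

# Remove Previous Report File (-f avoids an error if the file does not exist)
echo "Removing Previous Report File..."
rm -f "$REPORT_FILE"

# Export ontology; output and errors are appended to the report file
# (the check assumes enapsogdb exits with a non-zero status on failure)
if enapsogdb export --dburl "$DB_URL" --repository "$REPOSITORY_NAME" --targetfile "$EXPORT_FILE" --triplestore "$TRIPLESTORE" >> "$REPORT_FILE" 2>&1; then
    echo "Backup made successfully"
else
    echo "Backup failed, see $REPORT_FILE for details" >&2
    exit 1
fi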

Variables Explanation

DB_URL: URL where the triplestore is running.
REPOSITORY_NAME: Name of the repository.
FORMAT: Data format. Fuseki only returns data in the application/x-trig format, so specifying the format is not necessary. For other triplestores such as GraphDB and Stardog, the recommended format is application/x-trig because it supports named graphs.
EXPORT_FILE: Path and filename for the backup file.
REPORT_FILE: Path and filename for the report file, which contains the response from the script's execution.
TRIPLESTORE: Type of the triplestore (fuseki, graphdb, stardog).
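Instead of editing the script for every setup, the variables can fall back to defaults and be overridden from the environment. This is a sketch of one possible approach using shell parameter expansion, not part of the tool itself; the default values are the ones used in this guide.

```shell
#!/bin/bash
# Use a value from the environment if one is set; otherwise fall back to the guide's defaults.
DB_URL="${DB_URL:-http://localhost/fuseki}"
REPOSITORY_NAME="${REPOSITORY_NAME:-Test}"
FORMAT="${FORMAT:-application/x-trig}"
EXPORT_FILE="${EXPORT_FILE:-export.trig}"
REPORT_FILE="${REPORT_FILE:-report.txt}"
TRIPLESTORE="${TRIPLESTORE:-fuseki}"

echo "Backing up repository '$REPOSITORY_NAME' at $DB_URL ($TRIPLESTORE) to $EXPORT_FILE"
```

With these lines in place of the fixed assignments, a call such as REPOSITORY_NAME="Production" ./backup_script.sh changes the repository for a single run. Remember that for GraphDB and Stardog the export command also needs the --format option.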

Save the Script

Save the file with a .sh extension, such as backup_script.sh.

Run the Script

For Linux/macOS: Make the script executable with chmod +x backup_script.sh and run it by navigating to the directory and typing ./backup_script.sh.
For Windows: Ensure you have a tool like Git Bash, Cygwin, or WSL installed that can run Bash scripts. Navigate to the directory where the script is saved and execute it by typing ./backup_script.sh.

This setup allows you to create consistent and reliable backups of your graph database repositories across different operating systems and triplestore configurations.

Restore Process

A restore can be run either as a single command or from a script. The command line examples below cover each supported triplestore, so you can tailor the restore to your specific setup. The script that follows them restores a graph database repository from a backup file and includes an optional step to rebuild the cache on the ENAPSO platform, which is needed when using the ENAPSO together or ENAPSO together Free services.

Command Line Examples for Different Triplestores

Fuseki Example

enapsogdb import --dburl "http://localhost/fuseki" --repository "Test" --sourcefile "fuseki_backup.trig" --format "application/x-trig" --triplestore "fuseki"

GraphDB Example

enapsogdb import --dburl "http://localhost/graphdb" --repository "Test" --sourcefile "graphdb_backup.trig" --format "application/x-trig" --triplestore "graphdb"

Stardog Example

enapsogdb import --dburl "http://localhost/stardog" --repository "Test" --sourcefile "stardog_backup.trig" --format "application/x-trig" --triplestore "stardog"
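Before running an import, it is worth checking that the backup file is actually present and not empty, so that a mistyped path is caught before the triplestore is contacted. A minimal sketch; the function name check_source_file is illustrative:

```shell
#!/bin/bash
# Return 0 only if the backup file exists, is a regular file, and is not empty.
check_source_file() {
    if [ ! -f "$1" ]; then
        echo "Backup file not found: $1" >&2
        return 1
    fi
    if [ ! -s "$1" ]; then
        echo "Backup file is empty: $1" >&2
        return 1
    fi
    return 0
}
```

It can be combined with any of the commands above, for example check_source_file "fuseki_backup.trig" && enapsogdb import ... with the options shown for your triplestore.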

Create a New Script File

Open a text editor such as Notepad++ or Visual Studio Code.
Copy and paste the script content below.

Script Content

#!/bin/bash
echo "Running Script for Restoring Graph Database Repository Using enapso-graphdb-cli Tool"

# Set Variables
DB_URL="http://localhost/fuseki"
REPOSITORY_NAME="Test"
FORMAT="application/x-trig"
SOURCE_FILE="export.trig"
REPORT_FILE="report.txt"
TRIPLESTORE="fuseki"

# Remove Previous Report File (-f avoids an error if the file does not exist)
echo "Removing Previous Report File..."
rm -f "$REPORT_FILE"

# Import ontology; output and errors are appended to the report file
# (the check assumes enapsogdb exits with a non-zero status on failure)
if ! enapsogdb import --dburl "$DB_URL" --repository "$REPOSITORY_NAME" --sourcefile "$SOURCE_FILE" --format "$FORMAT" --triplestore "$TRIPLESTORE" >> "$REPORT_FILE" 2>&1; then
    echo "Restore failed, see $REPORT_FILE for details" >&2
    exit 1
fi

# Rebuild cache (optional, only needed on the ENAPSO platform)
curl -X POST http://localhost/enapso-dev/graphdb-management/v1/build-cache >> "$REPORT_FILE"

echo "Graph Database Repository Successfully Restored"

Variables Explanation

DB_URL: URL where the triplestore is running.
REPOSITORY_NAME: Name of the repository.
FORMAT: Data format of the backup file, which is application/x-trig.
SOURCE_FILE: Path and filename for the backup file.
REPORT_FILE: Path and filename for the report file, which contains the response from the script's execution.
TRIPLESTORE: Type of the triplestore (fuseki, graphdb, stardog).

Save the Script

Save the file with a .sh extension, such as restore_script.sh.

Run the Script

For Linux/macOS: Make the script executable with chmod +x restore_script.sh and run it by navigating to the directory and typing ./restore_script.sh.
For Windows: Ensure you have a tool like Git Bash, Cygwin, or WSL installed that can run Bash scripts. Navigate to the directory where the script is saved and execute it by typing ./restore_script.sh.

This setup allows you to restore your graph database repositories across different operating systems and triplestore configurations.

Additional Step Explanation

The curl request to rebuild the cache is an optional step, relevant for users whose repositories are hosted on the ENAPSO platform:

curl -X POST http://localhost/enapso-dev/graphdb-management/v1/build-cache >> $REPORT_FILE

This command triggers the ENAPSO service to rebuild its cache using the latest uploaded data. This step ensures that any changes from the restoration process are promptly reflected, enhancing the performance and efficiency of queries against the updated repository.

Include this step if you use the ENAPSO platform, because its cache mechanism requires it. After uploading data you need to build the cache, because the class model information used for auto-generating or managing templates is retrieved from the cache, not directly from the graph database repository. If you upload an ontology without building the cache and then try to create an auto CRUD template for one of its classes, the creation fails with an error message because the cache has no information about that class.
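Because a missing cache only shows up later as a template error, it can help to detect a failed rebuild request right away. The sketch below wraps the curl call and uses curl's --fail option so that an HTTP error status produces a non-zero exit code. The function name rebuild_cache is illustrative, and the sketch assumes the endpoint reports problems through HTTP error status codes.

```shell
#!/bin/bash
# POST to the cache endpoint and append the response to the report file.
# --fail makes curl exit non-zero on HTTP error responses (status 400 and above).
rebuild_cache() {
    local url="$1" report_file="$2"
    if curl --fail --silent --show-error -X POST "$url" >> "$report_file" 2>&1; then
        echo "Cache rebuild request succeeded"
    else
        echo "Cache rebuild request failed, see $report_file" >&2
        return 1
    fi
}
```

In the restore script it would be called as rebuild_cache "http://localhost/enapso-dev/graphdb-management/v1/build-cache" "$REPORT_FILE".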

Conclusion

Regular backups and effective restoration capabilities are essential for managing graph database repositories securely. Using the enapso-graphdb-cli tool, users can easily safeguard their data and restore it quickly if necessary. It's important to maintain a consistent backup routine and periodically test your restoration process to ensure data integrity and minimize downtime.

For production environments, it is recommended to automate this process using cron jobs or scheduled tasks to ensure backups are performed regularly without manual intervention. Automating the backup process enhances consistency by maintaining a regular backup schedule, improves reliability by reducing the likelihood of human error, and increases efficiency by allowing the backup process to run in the background.
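As a sketch of what such automation could look like, the helper below builds a dated file name so that scheduled runs do not overwrite each other, and the comment shows a standard crontab entry. The function name backup_filename, the naming scheme, the schedule, and the script path are all assumptions to adapt to your environment.

```shell
#!/bin/bash
# Build a dated backup file name such as Test_2024-01-31.trig.
# Arguments: repository name, optional date (defaults to today).
backup_filename() {
    local repository="$1" day="${2:-$(date +%Y-%m-%d)}"
    echo "${repository}_${day}.trig"
}

# Example crontab entry (add it with crontab -e): run the backup script every day at 02:00.
# 0 2 * * * /path/to/backup_script.sh
```

In the backup script, EXPORT_FILE="$(backup_filename "$REPOSITORY_NAME")" would then replace the fixed export.trig. Note that cron does not start jobs in the script's directory, so absolute paths for the export and report files are advisable.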

For additional support or details, refer to the enapso-graphdb-cli documentation on npm. This will ensure that your data management processes remain robust and reliable.