Creating a Backup and Restoring a Graph Database Repository with enapso-graphdb-cli
Welcome to the guide on how to back up and restore graph database repositories using the enapso-graphdb-cli tool. This document provides step-by-step instructions for installing the tool and for running backup and restore operations with popular triplestores such as GraphDB, Fuseki, and Stardog. Following these procedures protects against data loss and lets your graph database repository recover quickly from disruptions.
For detailed information, to share feedback, or to contribute, please visit our GitHub repository and check out our package on npm. Sharing and reusing this technical documentation can simplify these processes and help raise awareness about our tools.
Prerequisites
Ensure Node.js is installed on your machine. If not, install it from the Node.js official website. This installation includes npm (Node Package Manager), which manages Node packages.
After installation, verify that Node.js and npm are successfully installed by doing the following:
Open a command prompt or terminal.
Run the command
node -v
and press Enter. This displays the Node.js version if it is installed.
Run the command
npm -v
and press Enter. This displays the npm version if it is installed.
The ENAPSO tools require Node.js version 10 or higher, as earlier versions might not support some functionalities of the tools.
If you need to update Node.js, download the latest version from the official Node.js website and install it on your system. The installer automatically replaces the older version with the new one.
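The version requirement above can also be checked from a script. The following is a minimal sketch in POSIX shell; the function names node_major and node_is_supported and the variable MIN_NODE_MAJOR are hypothetical names chosen for this illustration and are not part of the ENAPSO tools.

```shell
#!/bin/sh
# Minimum major version stated in this guide.
MIN_NODE_MAJOR=10

# Extract the major version from a string such as "v18.17.0".
node_major() {
  version=${1#v}          # drop the leading "v"
  echo "${version%%.*}"   # keep everything before the first dot
}

# Succeed if the given version string meets the minimum major version.
node_is_supported() {
  [ "$(node_major "$1")" -ge "$MIN_NODE_MAJOR" ]
}

# Check the locally installed Node.js, if there is one.
if command -v node >/dev/null 2>&1; then
  installed=$(node -v)
  if node_is_supported "$installed"; then
    echo "Node.js $installed is supported."
  else
    echo "Node.js $installed is too old; version $MIN_NODE_MAJOR or higher is needed."
  fi
else
  echo "Node.js was not found; install it from the official Node.js website."
fi
```

The check compares only the major version, which is enough for the "version 10 or higher" rule stated in this guide.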
Installation
Install the enapso-graphdb-cli tool globally using npm:
npm install -g @innotrade/enapso-graphdb-cli
Supported Triplestores
The tool supports the following Knowledge Graph Platforms:
GraphDB
Fuseki
Stardog
Backup Process
Use the following script to create a backup of the graph database repository. The script example is configured for a Fuseki repository but can be adjusted for the other supported triplestores by changing the appropriate variables. The command line examples below show the export command for each triplestore, so you can tailor the backup process to your specific setup.
Command Line Examples for Different Triplestores
Fuseki Example
Fuseki always returns the data in application/x-trig format when a repository is exported, because the format cannot be set. The example therefore does not pass a format.
enapsogdb export --dburl "http://localhost/fuseki" --repository "Test" --targetfile "fuseki_backup.trig" --triplestore "fuseki"
GraphDB Example
enapsogdb export --dburl "http://localhost/graphdb" --repository "Test" --targetfile "graphdb_backup.trig" --format "application/x-trig" --triplestore "graphdb"
Stardog Example
Create a New Script File
Open a text editor such as Notepad++ or Visual Studio Code.
Copy and paste the script content below.
Script Content
Variables Explanation
DB_URL: URL where the triplestore is running.
REPOSITORY_NAME: Name of the repository.
FORMAT: Data format. For Fuseki, the data is only returned in the application/x-trig format, so specifying the format is not necessary. For other triplestores like GraphDB and Stardog, the recommended format is application/x-trig due to its support of named graphs.
EXPORT_FILE: Path and filename for the backup file.
REPORT_FILE: Path and filename for the report file, which contains the response from the script's execution.
TRIPLESTORE: Type of the triplestore (fuseki, graphdb, stardog).
CONTEXT: Context to be exported. If left empty, the entire repository is exported. If a string is provided, it saves the context's data. If an array of contexts is provided, it creates a zip file containing each context's data.
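The variables above could be combined into a script along the following lines. This is a sketch and not the original script content. The command-line options are the ones shown in the Fuseki and GraphDB examples of this guide. Everything else is an assumption made for illustration: the DRY_RUN switch, the function name run_export, the report file name, and the idea of writing the tool's output into REPORT_FILE by redirection. CONTEXT is left out because this guide does not show the option used to pass it.

```shell
#!/bin/sh
# Placeholder values taken from the examples in this guide.
DB_URL="http://localhost/fuseki"
REPOSITORY_NAME="Test"
FORMAT="application/x-trig"
EXPORT_FILE="fuseki_backup.trig"
REPORT_FILE="backup_report.txt"   # hypothetical file name
TRIPLESTORE="fuseki"
DRY_RUN="${DRY_RUN:-1}"           # hypothetical switch: 1 only prints the command

run_export() {
  # Collect the arguments as positional parameters so values with spaces survive.
  set -- export --dburl "$DB_URL" --repository "$REPOSITORY_NAME" \
    --targetfile "$EXPORT_FILE"
  # Fuseki always returns application/x-trig, so --format is only passed otherwise.
  if [ "$TRIPLESTORE" != "fuseki" ]; then
    set -- "$@" --format "$FORMAT"
  fi
  set -- "$@" --triplestore "$TRIPLESTORE"

  if [ "$DRY_RUN" = "1" ]; then
    # Print the command without quoting, for inspection only.
    echo "enapsogdb $*"
  else
    # Assumption: the tool's console output is what the report should contain.
    enapsogdb "$@" > "$REPORT_FILE" 2>&1
  fi
}

run_export
```

With the default DRY_RUN=1 the script only prints the command it would run. Setting TRIPLESTORE to graphdb or stardog, together with a matching DB_URL and EXPORT_FILE, adds the format option to the command.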
Context Exporting Details
Without Context:
For Fuseki: Exports the entire repository in application/x-trig format.
For GraphDB and Stardog: Exports the whole repository in the specified format.
With Context as String:
For Fuseki: Exports the specified context in text/turtle format.
For GraphDB and Stardog: Exports the specified context in the provided format.
With Context as Array:
For Fuseki: Exports each context in text/turtle format and saves them in a zip file.
For GraphDB and Stardog: Exports each context in the provided format and saves them in a zip file, with each file named after its context.
Save the Script
Save the file with a .sh extension, such as backup_script.sh.
Run the Script
For Linux/macOS: Make the script executable with chmod +x backup_script.sh, then navigate to its directory and run it by typing ./backup_script.sh.
For Windows: Ensure you have a tool like Git Bash, Cygwin, or WSL installed that can run Bash scripts. Navigate to the directory where the script is saved and execute it by typing ./backup_script.sh.
This setup allows you to create consistent and reliable backups of your graph database repositories across different operating systems and triplestore configurations.
Restore Process
The script example below restores a graph database repository from a backup file. It includes an optional step to rebuild the cache on the ENAPSO platform, which is necessary for those using the ENAPSO together or ENAPSO together Free services. The command line examples for each triplestore let you tailor the restore process to your specific setup.
Command Line Examples for Different Triplestores
Fuseki Example
GraphDB Example
Stardog Example
Create a New Script File
Open a text editor such as Notepad++ or Visual Studio Code.
Copy and paste the script content below.
Script Content
Variables Explanation
DB_URL: URL where the triplestore is running.
REPOSITORY_NAME: Name of the repository.
FORMAT: Data format of the backup file, which is application/x-trig.
SOURCE_FILE: Path and filename for the backup file.
REPORT_FILE: Path and filename for the report file, which contains the response from the script's execution.
TRIPLESTORE: Type of the triplestore (fuseki, graphdb, stardog).
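A restore script built from the variables above might look like the sketch below. Note that this guide does not show the restore command itself, so the subcommand import and the option --sourcefile are assumptions modeled on the export command and its --targetfile option; verify them against the enapso-graphdb-cli documentation before use. The DRY_RUN switch, the function name run_import, and the report file name are likewise hypothetical. The optional cache rebuild request is not included.

```shell
#!/bin/sh
# Placeholder values; the file name matches the GraphDB backup example.
DB_URL="http://localhost/graphdb"
REPOSITORY_NAME="Test"
FORMAT="application/x-trig"
SOURCE_FILE="graphdb_backup.trig"
REPORT_FILE="restore_report.txt"  # hypothetical file name
TRIPLESTORE="graphdb"
DRY_RUN="${DRY_RUN:-1}"           # hypothetical switch: 1 only prints the command

run_import() {
  # ASSUMPTION: "import" and "--sourcefile" are not confirmed by this guide.
  set -- import --dburl "$DB_URL" --repository "$REPOSITORY_NAME" \
    --sourcefile "$SOURCE_FILE" --format "$FORMAT" --triplestore "$TRIPLESTORE"

  if [ "$DRY_RUN" = "1" ]; then
    # Print the command without quoting, for inspection only.
    echo "enapsogdb $*"
    return 0
  fi

  # Refuse to start a restore when the backup file is missing.
  if [ ! -f "$SOURCE_FILE" ]; then
    echo "Backup file not found: $SOURCE_FILE" >&2
    return 1
  fi

  # Assumption: the tool's console output is what the report should contain.
  enapsogdb "$@" > "$REPORT_FILE" 2>&1
}

run_import
```

Checking for the backup file first avoids starting a restore that cannot succeed.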
Save the Script
Save the file with a .sh extension, such as restore_script.sh.
Run the Script
For Linux/macOS: Make the script executable with chmod +x restore_script.sh, then navigate to its directory and run it by typing ./restore_script.sh.
For Windows: Ensure you have a tool like Git Bash, Cygwin, or WSL installed that can run Bash scripts. Navigate to the directory where the script is saved and execute it by typing ./restore_script.sh.
This setup allows you to restore your graph database repositories across different operating systems and triplestore configurations.
Additional Step Explanation
The curl request to rebuild the cache is an optional step, relevant for users whose repositories are hosted on the ENAPSO platform:
This command triggers the ENAPSO service to rebuild its cache using the latest uploaded data. This step ensures that any changes from the restoration process are promptly reflected, enhancing the performance and efficiency of queries against the updated repository.
Include this step if you use the ENAPSO platform, because the cache mechanism depends on it. After you upload data, the cache must be created, because the class model information used for auto-generating or managing templates is retrieved from the cache, not directly from the graph database repository. If you upload an ontology without creating the cache and then try to create an auto CRUD template for one of its classes, the attempt fails with an error message because the cache has no information about that class.
Conclusion
Regular backups and effective restoration capabilities are essential for managing graph database repositories securely. Using the enapso-graphdb-cli tool, users can easily safeguard their data and restore it quickly if necessary. It's important to maintain a consistent backup routine and periodically test your restoration process to ensure data integrity and minimize downtime.
For production environments, it is recommended to automate this process using cron jobs or scheduled tasks to ensure backups are performed regularly without manual intervention. Automating the backup process enhances consistency by maintaining a regular backup schedule, improves reliability by reducing the likelihood of human error, and increases efficiency by allowing the backup process to run in the background.
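As a rough illustration of such automation, the sketch below builds a dated backup file name so that scheduled runs do not overwrite each other, and shows a crontab entry in a comment. The function name backup_target_name, the naming scheme, the schedule, and the paths are all hypothetical.

```shell
#!/bin/sh
# Build a dated backup file name from a repository name and an optional date.
backup_target_name() {
  repository=$1
  day=${2:-$(date +%Y-%m-%d)}   # defaults to today's date
  echo "${repository}_${day}.trig"
}

# Name that a backup of the "Test" repository would get today.
backup_target_name "Test"

# Hypothetical crontab entry (edit with `crontab -e`) that runs the backup
# script every day at 02:00 and appends its output to a log file:
# 0 2 * * * /path/to/backup_script.sh >> /path/to/backup.log 2>&1
```

The resulting name could be used as the backup file path inside the backup script. On Windows, a scheduled task can play the role of the cron entry.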
For additional support or details, refer to the enapso-graphdb-cli documentation on npm. This will ensure that your data management processes remain robust and reliable.
(C) Copyright 2014-2024 INNOTRADE GmbH, Herzogenrath, NRW, Germany (all rights reserved)