Live import
Live Loader imports data into a running Dgraph cluster using the dgraph live command. Unlike Bulk Loader, Live Loader can import data into an existing database with prior data and supports upserts for updating existing nodes.
Use Live Loader when:
- Importing data into a running cluster
- Updating or adding data to existing nodes
- Loading smaller datasets (for large initial loads, consider Bulk Loader)
Prerequisites
Before importing, ensure you have:
- A running Dgraph cluster
- Data files in RDF (.rdf, .rdf.gz) or JSON (.json, .json.gz) format
- A schema file (optional but recommended)
Live Loader accepts RDF N-Quad/Triple data or JSON in plain or gzipped format. See data migration for converting other formats.
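As a minimal sketch of the inputs listed above, the following creates a tiny RDF file and a matching schema in a scratch directory, then gzips the data. The file names, the `name` and `friend` predicates, and the two sample nodes are illustrative assumptions, not part of any real dataset.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir="$(mktemp -d)"

# Three sample triples in RDF N-Quad form; _:alice and _:bob are blank nodes.
cat > "$workdir/data.rdf" <<'EOF'
_:alice <name> "Alice" .
_:bob <name> "Bob" .
_:alice <friend> _:bob .
EOF

# Schema for the sample predicates (hypothetical names).
cat > "$workdir/schema.txt" <<'EOF'
name: string @index(exact) .
friend: [uid] .
EOF

# Live Loader accepts gzipped input, so compress the data file.
gzip -f "$workdir/data.rdf"

echo "data:   $workdir/data.rdf.gz"
echo "schema: $workdir/schema.txt"
```

The two printed paths can then be passed to --files and --schema.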
Quick Start
dgraph live --files ./data.rdf.gz --schema ./schema.txt --alpha localhost:9080
Basic Usage
Local:
dgraph live \
--files <path-to-data> \
--schema <path-to-schema> \
--alpha localhost:9080
Docker:
docker run -it --rm -v <local-path-to-data>:/tmp dgraph/dgraph:latest \
dgraph live \
--files /tmp/<data-file> \
--schema /tmp/<schema-file> \
--alpha <dgraph-alpha-address>:9080
Key options:
- --alpha: Dgraph Alpha gRPC endpoint (default: localhost:9080). Specify multiple comma-separated addresses to distribute load.
- --files: Path to a data file or directory. When a directory is specified, all .rdf, .rdf.gz, .json, and .json.gz files in it are loaded.
- --schema: Path to the schema file. Use an extension other than the data extensions, such as .txt or .schema.
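The comma-separated --alpha form can be sketched as follows. The host names alpha1, alpha2, and alpha3 and the ./data-dir path are hypothetical, and the command is only printed, not executed, so the snippet is safe to try without a cluster.

```shell
# Join any number of Alpha addresses into the comma-separated form
# that --alpha expects.
join_alphas() {
  joined=""
  for addr in "$@"; do
    if [ -z "$joined" ]; then
      joined="$addr"
    else
      joined="$joined,$addr"
    fi
  done
  printf '%s\n' "$joined"
}

# Hypothetical host names; replace with your own Alpha gRPC endpoints.
alphas="$(join_alphas alpha1:9080 alpha2:9080 alpha3:9080)"

# Print the command instead of running it.
echo dgraph live --files ./data-dir --schema ./schema.txt --alpha "$alphas"
```

Passing a directory to --files, as here, loads every supported data file inside it.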
Upserts: Update Existing Data
Live Loader can update existing nodes using upserts. Use one of these approaches:
Using --upsertPredicate
Specify a predicate that serves as a unique identifier:
dgraph live \
--files ./data.rdf.gz \
--schema ./schema.txt \
--upsertPredicate xid
The upsert predicate must exist in the schema and be indexed.
If you are using xid as the upsert predicate name, make sure your schema contains:
<xid>: string @index(exact) @upsert .
Example: If your data contains:
<urn:uuid:550e8400-e29b-41d4-a716-446655440000> <http://xmlns.com/foaf/0.1/name> "Alice Smith" .
This creates or updates the node where xid = "urn:uuid:550e8400-e29b-41d4-a716-446655440000" and sets its http://xmlns.com/foaf/0.1/name predicate to "Alice Smith".
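Because the upsert predicate must be indexed and declared in the schema, it may help to check the schema file before starting a load. The helper below is a hypothetical plain-text check, not a schema parser and not part of dgraph: it only looks for a line that declares the predicate with both @index and @upsert. The scratch schema file is created here purely for illustration.

```shell
# Return 0 if the schema file declares the predicate with both an index
# and the @upsert directive on the same line; return non-zero otherwise.
has_upsert_predicate() {
  schema_file="$1"
  predicate="$2"
  grep -E "^[[:space:]]*<?${predicate}>?[[:space:]]*:" "$schema_file" \
    | grep '@index' \
    | grep -q '@upsert'
}

# Scratch schema containing the declaration shown above.
schema_tmp="$(mktemp)"
printf '%s\n' '<xid>: string @index(exact) @upsert .' > "$schema_tmp"

if has_upsert_predicate "$schema_tmp" xid; then
  echo "xid is ready for --upsertPredicate"
else
  echo "xid is missing @index or @upsert" >&2
fi
```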
Using --xidmap
Store UID mappings in a local directory for consistent imports:
dgraph live \
--files ./data.rdf.gz \
--schema ./schema.txt \
--xidmap ./xid-directory
Live Loader looks up existing UIDs or stores new mappings in this directory.
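One way this might be used is to load several files in separate runs that share one mapping directory, so an identifier appearing in more than one file can resolve to the same UID. The sketch below assumes hypothetical file names (users.rdf.gz, orders.rdf.gz) and a DRY_RUN switch that is not a dgraph feature; with DRY_RUN=1, the default here, the commands are only printed.

```shell
# Directory that keeps the identifier-to-UID mappings between runs.
XIDMAP_DIR="${XIDMAP_DIR:-./xid-directory}"
# DRY_RUN=1 (the default in this sketch) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

load_with_xidmap() {
  data_file="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "dgraph live --files $data_file --schema ./schema.txt --xidmap $XIDMAP_DIR"
  else
    dgraph live --files "$data_file" --schema ./schema.txt --xidmap "$XIDMAP_DIR"
  fi
}

# Hypothetical file names: two batches loaded in separate runs.
load_with_xidmap ./users.rdf.gz
load_with_xidmap ./orders.rdf.gz
```

Set DRY_RUN=0 to run the loads against a cluster.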
Loading from Cloud Storage
Amazon S3
Set credentials via environment variables or use IAM roles:
| Environment Variable | Description |
|---|---|
| AWS_ACCESS_KEY_ID | AWS access key with S3 read permissions |
| AWS_SECRET_ACCESS_KEY | AWS secret key |
# Short form (note triple slash)
dgraph live \
--files s3:///<bucket>/<path> \
--schema s3:///<bucket>/<path>/schema.txt
# Long form
dgraph live \
--files s3://s3.<region>.amazonaws.com/<bucket>/<path> \
--schema s3://s3.<region>.amazonaws.com/<bucket>/<path>/schema.txt
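The two URL forms above can be assembled from their parts as in this sketch. The bucket name, region, and object paths are hypothetical placeholders, and the final command is printed rather than run.

```shell
# Build the short S3 URL form (note the triple slash).
s3_short_url() {  # args: bucket path
  printf 's3:///%s/%s\n' "$1" "$2"
}

# Build the long S3 URL form with an explicit regional endpoint.
s3_long_url() {   # args: region bucket path
  printf 's3://s3.%s.amazonaws.com/%s/%s\n' "$1" "$2" "$3"
}

# Note missing static credentials early; IAM roles need neither variable.
[ -n "${AWS_ACCESS_KEY_ID:-}" ] || echo "note: AWS_ACCESS_KEY_ID is not set" >&2
[ -n "${AWS_SECRET_ACCESS_KEY:-}" ] || echo "note: AWS_SECRET_ACCESS_KEY is not set" >&2

# Hypothetical bucket, region, and paths.
data_url="$(s3_short_url my-bucket exports/data.rdf.gz)"
schema_url="$(s3_long_url us-east-1 my-bucket exports/schema.txt)"

echo dgraph live --files "$data_url" --schema "$schema_url"
```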
IAM Setup
Instead of credentials, configure IAM:
- Create an IAM Role with S3 access
- Attach it using:
- Instance Profile for EC2
- IAM roles for service accounts for EKS
MinIO
| Environment Variable | Description |
|---|---|
| MINIO_ACCESS_KEY | MinIO access key |
| MINIO_SECRET_KEY | MinIO secret key |
dgraph live \
--files minio://<server>:<port>/<bucket>/<path> \
--schema minio://<server>:<port>/<bucket>/<path>/schema.txt
Multi-tenancy
When ACL is enabled, provide credentials with --creds. By default, data loads into the user's namespace.
dgraph live \
--files ./data.rdf.gz \
--schema ./schema.txt \
--creds "user=groot;password=password;namespace=0"
Loading into a Specific Namespace
Guardians of the Galaxy can load data into any namespace using --force-namespace:
# Load into namespace 123
dgraph live \
--files ./data.rdf.gz \
--schema ./schema.txt \
--creds "user=groot;password=password;namespace=0" \
--force-namespace 123
To preserve namespaces from export files, use --force-namespace -1:
dgraph live \
--files ./data.rdf.gz \
--schema ./schema.txt \
--creds "user=groot;password=password;namespace=0" \
--force-namespace -1
The target namespace must exist before loading data.
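The --creds value used in these examples is a single semicolon-separated string, so it can be built from its parts as sketched below. DGRAPH_PASSWORD is a hypothetical variable name used here to keep the password out of the script; the fallback value simply mirrors the examples above. The command is printed, not executed.

```shell
# Build the value for --creds from its three parts.
build_creds() {  # args: user password namespace
  printf 'user=%s;password=%s;namespace=%s\n' "$1" "$2" "$3"
}

# Read the password from the environment when available.
creds="$(build_creds groot "${DGRAPH_PASSWORD:-password}" 0)"
target_namespace=123

# The value stays quoted in the printed command because an unquoted ';'
# would end the shell command early.
echo dgraph live --files ./data.rdf.gz --schema ./schema.txt \
  --creds "\"$creds\"" --force-namespace "$target_namespace"
```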
Encrypted Data
To load encrypted export files, provide the decryption key:
# Using key file
dgraph live \
--files ./encrypted-data.rdf.gz \
--schema ./encrypted-schema.txt \
--encryption key-file=./encryption.key
# Using HashiCorp Vault
dgraph live \
--files ./encrypted-data.rdf.gz \
--schema ./encrypted-schema.txt \
--vault "addr=http://localhost:8200;enc-field=enc_key;enc-format=raw;path=secret/data/dgraph/alpha;role-id-file=./role_id;secret-id-file=./secret_id"
Encrypted exports can be imported into unencrypted Dgraph instances. The p directory will only be encrypted if the Alpha has encryption enabled.
CLI Options Reference
| Flag | Default | Description |
|---|---|---|
| --files, -f | | Data file or directory path |
| --schema, -s | | Schema file path |
| --alpha, -a | localhost:9080 | Dgraph Alpha gRPC address(es) |
| --batch, -b | 1000 | N-Quads per mutation batch |
| --conc, -c | 10 | Concurrent requests to Dgraph |
| --upsertPredicate, -U | | Predicate for upsert matching |
| --xidmap, -x | | Directory for UID mappings |
| --new_uids | false | Assign new UIDs instead of preserving existing |
| --format | | Force format (rdf or json) |
| --use_compression, -C | false | Enable gRPC compression |
| --creds | | ACL credentials (user=;password=;namespace=) |
| --force-namespace | | Load into specific namespace (Guardian only) |
| --encryption | | Encryption key file for decryption |
| --vault | | Vault configuration for encryption key |
See dgraph live CLI reference for the complete list of options.