Deprecated
This feature is replaced by [Filer Active-Active cross-cluster continuous synchronization](https://github.com/seaweedfs/seaweedfs/wiki/Filer-Active-Active-cross-cluster-continuous-synchronization).
Architecture
Requirements
There should be two SeaweedFS clusters running, possibly across data centers. Each should have its own filer, master, and volume servers.
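As a minimal sketch, two independent clusters could be started on one machine like this; the ports and directories are illustrative assumptions, and in practice the clusters would live on separate hosts or data centers:

```
# Cluster 1 (source)
weed master -port=9333 -mdir=./master1 &
weed volume -port=8080 -dir=./vol1 -mserver=localhost:9333 &
weed filer -port=8888 -master=localhost:9333 &    # filer gRPC on 18888 (HTTP port + 10000)

# Cluster 2 (target)
weed master -port=9334 -mdir=./master2 &
weed volume -port=8081 -dir=./vol2 -mserver=localhost:9334 &
weed filer -port=8889 -master=localhost:9334 &    # filer gRPC on 18889
```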
Configuration
- Configure notification. Use `weed scaffold -config=notification` to see the notification section:
```
[notification.log]
# this is only for debugging purposes and does not work with "weed filer.replicate"
enabled = false

[notification.kafka]
enabled = false
hosts = [
  "localhost:9092"
]
topic = "seaweedfs_filer"
offsetFile = "./last.offset"
offsetSaveIntervalSeconds = 10

[notification.aws_sqs]
# experimental, let me know if it works
enabled = false
aws_access_key_id = ""      # if empty, loads from the shared credentials file (~/.aws/credentials).
aws_secret_access_key = ""  # if empty, loads from the shared credentials file (~/.aws/credentials).
region = "us-east-2"
sqs_queue_name = "my_filer_queue"  # an existing queue name

[notification.google_pub_sub]
# read credentials doc at https://cloud.google.com/docs/authentication/getting-started
enabled = false
google_application_credentials = "/path/to/x.json"  # path to json credential file
project_id = ""                    # an existing project id
topic = "seaweedfs_filer_topic"    # a topic, auto-created if it does not exist

[notification.gocdk_pub_sub]
# The Go Cloud Development Kit (https://gocloud.dev).
# PubSub API (https://godoc.org/gocloud.dev/pubsub).
# Supports AWS SNS/SQS, Azure Service Bus, Google PubSub, NATS and RabbitMQ.
enabled = false
# This URL will Dial the RabbitMQ server at the URL in the environment
# variable RABBIT_SERVER_URL and open the exchange "myexchange".
# The exchange must have already been created by some other means, like
# the RabbitMQ management plugin.
topic_url = "rabbit://myexchange"
sub_url = "rabbit://myqueue"
```
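To actually use one of these backends, write the scaffold to a file and enable the section you need; for this guide that is `[notification.kafka]`. A sketch (the current directory is one of the places SeaweedFS searches for `notification.toml`, alongside `~/.seaweedfs/` and `/etc/seaweedfs/`):

```
# Write notification.toml into the current directory, then edit it:
# set enabled = true under [notification.kafka] and point hosts/topic
# at your Kafka brokers.
weed scaffold -config=notification -output=.
```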
- Set up Kafka. You may need to create the Kafka topic yourself if automatic topic creation is not enabled, as sketched below.
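For example, with the stock Kafka tooling (the topic name matches the replication configuration below; the single-partition settings are assumptions suited to a one-broker test setup):

```
kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --topic seaweedfs_filer1_to_filer2 \
  --partitions 1 \
  --replication-factor 1
```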
- Configure replication. Use `weed scaffold -config=replication` to see the source and sink sections:
```
[source.filer]
enabled = true
grpcAddress = "localhost:18888"
directory = "/buckets"   # all files under this directory tree are replicated

[notification.kafka]
enabled = true
hosts = [
  "localhost:9092"
]
topic = "seaweedfs_filer1_to_filer2"

[sink.filer]
enabled = true
grpcAddress = "localhost:18888"
directory = "/backup"    # all replicated files are under this directory tree
replication = ""
collection = ""
ttlSec = 0
```
Please read the source.filer and sink.filer configurations carefully and make sure you fully understand their intent. As written, this uses a single filer and replicates files from one directory tree (/buckets) to another (/backup).
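To replicate to a second cluster instead, point the sink at the other filer's gRPC address. A sketch, where the host name and port are assumptions:

```
[sink.filer]
enabled = true
grpcAddress = "filer2.example.com:18889"  # the target filer's gRPC port (HTTP port + 10000)
directory = "/backup"                     # replicated files land under this tree on the target
replication = ""
collection = ""
ttlSec = 0
```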
- Start Kafka.
- Start the replication: `weed filer.replicate`
- Start the filer: `weed filer`
Replicate existing files
`weed filer.replicate` only replicates files that are created or updated after it starts. If there are already a lot of files, very likely you will need to copy them too.
To do this, run `echo 'fs.meta.notify' | weed shell`, which iterates over all files and generates one "file creation" event per file to the message queue. Thus, all existing files will be replicated.
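If you run multiple clusters, make sure the shell talks to the source cluster's master; for example (the address is an assumption matching the sketch above):

```
echo 'fs.meta.notify' | weed shell -master=localhost:9333
```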