mirror of https://github.com/seaweedfs/seaweedfs.git synced 2024-11-24 02:59:13 +08:00

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lakes, built to handle billions of files. The blob store has O(1) disk seek and cloud tiering. The Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, and Erasure Coding.

blob-storage cloud-drive distributed-file-system distributed-storage distributed-systems erasure-coding fuse hadoop-hdfs hdfs kubernetes object-storage posix replication s3 s3-storage seaweedfs tiered-file-system


Chris Lu f0b928ff5e go fmt		2021-10-11 23:23:46 -07:00
.github	fix container building, make it same as rocksdb image	2021-10-11 01:29:05 -07:00
docker	github action build rocksdb image	2021-09-30 21:43:34 -07:00
k8s/helm_charts2	2.71	2021-10-10 22:40:44 -07:00
note	Update SeaweedFS_Gateway_RemoteObjectStore.png	2021-10-06 20:43:02 -07:00
other	java: adjust cache expiration policy for long running java processes	2021-10-09 05:38:15 -07:00
snap	tweaking snap	2020-03-12 13:23:25 -07:00
test	adjust test code	2021-09-26 12:59:49 -07:00
unmaintained	stream read multiple volumes in a volume server	2021-09-27 02:51:31 -07:00
util	util: added gostd script	2019-04-30 03:23:20 +00:00
weed	go fmt	2021-10-11 23:23:46 -07:00
.gitignore	weed.go: remove unused parameter	2019-06-26 10:46:32 +08:00
backers.md	Update backers.md	2021-03-03 10:52:02 -08:00
go.mod	removing tikv to resolve "go mod tidy" problem	2021-10-10 19:27:02 -07:00
go.sum	removing tikv to resolve "go mod tidy" problem	2021-10-10 19:27:02 -07:00
LICENSE	clean up	2020-06-19 13:53:54 -07:00
README.md	Update README.md	2021-09-14 12:09:04 -07:00

README.md

SeaweedFS

Sponsor SeaweedFS via Patreon

SeaweedFS is an independent Apache-licensed open source project whose ongoing development is made possible entirely by the support of these awesome backers. If you'd like to help SeaweedFS grow even stronger, please consider joining our sponsors on Patreon.

Your support will be really appreciated by me and other supporters!

Quick Start for S3 API on Docker

docker run -p 8333:8333 chrislusf/seaweedfs server -s3

Quick Start with single binary

Download the latest binary from https://github.com/chrislusf/seaweedfs/releases and unzip a single binary file weed or weed.exe
Run weed server -dir=/some/data/dir -s3 to start one master, one volume server, one filer, and one S3 gateway.

Also, to increase capacity, just add more volume servers by running weed volume -dir="/some/data/dir2" -mserver="<master_host>:9333" -port=8081 locally, or on a different machine, or on thousands of machines. That is it!

Introduction

SeaweedFS is a simple and highly scalable distributed file system. There are two objectives:

to store billions of files!
to serve the files fast!

SeaweedFS started as an Object Store to handle small files efficiently. Instead of managing all file metadata in a central master, the central master only manages volumes on volume servers, and these volume servers manage files and their metadata. This relieves concurrency pressure from the central master and spreads file metadata into volume servers, allowing faster file access (O(1), usually just one disk read operation).
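The split of responsibilities described above can be sketched as a toy in-memory model. This is only an illustration, not SeaweedFS's actual code or on-disk format: the type names (`Master`, `VolumeServer`, `Needle`), their fields, and the address used in `main` are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Needle records where one file's bytes live inside a volume.
// The field names are hypothetical, not SeaweedFS's real layout.
type Needle struct {
	Offset int64
	Size   int32
}

// VolumeServer keeps the per-file metadata for the volumes it hosts.
type VolumeServer struct {
	Addr    string
	Volumes map[uint32]map[uint64]Needle // volume id -> file key -> location
}

// Master only knows which server hosts which volume.
// It holds no per-file state.
type Master struct {
	Locations map[uint32]*VolumeServer
}

// ErrNotFound is returned when a volume or file key is unknown.
var ErrNotFound = errors.New("not found")

// Lookup resolves a (volume id, file key) pair with two map reads:
// one on the master and one on the volume server.
func (m *Master) Lookup(volumeID uint32, fileKey uint64) (string, Needle, error) {
	vs, ok := m.Locations[volumeID]
	if !ok {
		return "", Needle{}, ErrNotFound
	}
	n, ok := vs.Volumes[volumeID][fileKey]
	if !ok {
		return "", Needle{}, ErrNotFound
	}
	return vs.Addr, n, nil
}

func main() {
	vs := &VolumeServer{
		Addr: "127.0.0.1:8080", // illustrative address
		Volumes: map[uint32]map[uint64]Needle{
			3: {1: {Offset: 0, Size: 1024}},
		},
	}
	m := &Master{Locations: map[uint32]*VolumeServer{3: vs}}

	addr, n, err := m.Lookup(3, 1)
	fmt.Println(addr, n.Offset, n.Size, err)
}
```

In this model the master's map grows with the number of volumes rather than the number of files, which is the property the paragraph above relies on.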

There are only 40 bytes of disk storage overhead for each file's metadata. It is so simple with O(1) disk reads that you are welcome to challenge the performance with your actual use cases.
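To put the 40-byte figure in perspective, here is a small back-of-the-envelope calculation. The function name and the file counts are chosen for illustration only; the 40 bytes per file is the only number taken from the text above.

```go
package main

import "fmt"

// metadataBytesPerFile is the per-file on-disk metadata overhead quoted above.
const metadataBytesPerFile = 40

// MetadataOverheadBytes returns the total on-disk metadata overhead for n files.
func MetadataOverheadBytes(n int64) int64 {
	return n * metadataBytesPerFile
}

func main() {
	// Example file counts; pick values that match your own workload.
	for _, n := range []int64{1_000_000, 1_000_000_000} {
		b := MetadataOverheadBytes(n)
		fmt.Printf("%d files -> %d bytes (%.2f GiB)\n", n, b, float64(b)/(1<<30))
	}
}
```

At one billion files the metadata overhead comes to 40,000,000,000 bytes, or about 37.25 GiB across the whole cluster.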

SeaweedFS started by implementing Facebook's Haystack design paper. SeaweedFS also implements erasure coding with ideas from f4: Facebook's Warm BLOB Storage System, and has a lot of similarities with Facebook's Tectonic Filesystem.

On top of the object store, an optional Filer supports directories and POSIX attributes. The Filer is a separate, linearly scalable, stateless server with customizable metadata stores such as MySQL, Postgres, Redis, Cassandra, HBase, MongoDB, Elasticsearch, LevelDB, RocksDB, SQLite, MemSQL, TiDB, etcd, and CockroachDB.

For any distributed key-value store, large values can be offloaded to SeaweedFS. With its fast access speed and linearly scalable capacity, SeaweedFS can work as a distributed Key-Large-Value store.
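One way to picture the Key-Large-Value idea is the sketch below: small values stay inline in the key-value store, while values above a size threshold go to a blob store and only a reference is kept. Everything here is hypothetical. `BlobStore`, `memBlobs`, `KLV`, and the threshold are names invented for the example, and the in-memory `memBlobs` merely stands in for a real SeaweedFS client.

```go
package main

import "fmt"

// BlobStore stands in for SeaweedFS in this sketch.
type BlobStore interface {
	Put(data []byte) (id string, err error)
	Get(id string) ([]byte, error)
}

// memBlobs is an in-memory BlobStore that keeps the example runnable.
type memBlobs struct {
	next int
	data map[string][]byte
}

func newMemBlobs() *memBlobs { return &memBlobs{data: map[string][]byte{}} }

func (m *memBlobs) Put(b []byte) (string, error) {
	m.next++
	id := fmt.Sprintf("blob-%d", m.next)
	m.data[id] = append([]byte(nil), b...) // store a copy
	return id, nil
}

func (m *memBlobs) Get(id string) ([]byte, error) {
	b, ok := m.data[id]
	if !ok {
		return nil, fmt.Errorf("blob %q not found", id)
	}
	return b, nil
}

// entry holds either the value itself or a reference to an offloaded blob.
type entry struct {
	inline []byte
	blobID string
}

// KLV keeps small values inline and offloads values larger than threshold bytes.
type KLV struct {
	threshold int
	index     map[string]entry
	blobs     BlobStore
}

func NewKLV(threshold int, blobs BlobStore) *KLV {
	return &KLV{threshold: threshold, index: map[string]entry{}, blobs: blobs}
}

// Put stores value under key, offloading it when it exceeds the threshold.
func (s *KLV) Put(key string, value []byte) error {
	if len(value) <= s.threshold {
		s.index[key] = entry{inline: append([]byte(nil), value...)}
		return nil
	}
	id, err := s.blobs.Put(value)
	if err != nil {
		return err
	}
	s.index[key] = entry{blobID: id}
	return nil
}

// Get returns the value for key, fetching it from the blob store if offloaded.
func (s *KLV) Get(key string) ([]byte, error) {
	e, ok := s.index[key]
	if !ok {
		return nil, fmt.Errorf("key %q not found", key)
	}
	if e.blobID == "" {
		return e.inline, nil
	}
	return s.blobs.Get(e.blobID)
}

func main() {
	s := NewKLV(16, newMemBlobs())
	_ = s.Put("small", []byte("tiny"))
	_ = s.Put("large", make([]byte, 1<<20))
	v, err := s.Get("large")
	fmt.Println(len(v), err)
}
```

The key-value store then only ever holds short entries, so its own capacity and performance are unaffected by value size.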

SeaweedFS can transparently integrate with the cloud. With hot data on the local cluster and warm data in the cloud with O(1) access time, SeaweedFS can achieve both fast local access and elastic cloud storage capacity. What's more, the cost of cloud storage API access is minimized. Faster and cheaper than direct cloud storage!

System	File Metadata	File Content Read	POSIX	REST API	Optimized for large number of small files
SeaweedFS	lookup volume id, cacheable	O(1) disk seek		Yes	Yes
SeaweedFS Filer	Linearly Scalable, Customizable	O(1) disk seek	FUSE	Yes	Yes
GlusterFS	hashing		FUSE, NFS
Ceph	hashing + rules		FUSE	Yes
MooseFS	in memory		FUSE		No
MinIO	separate meta file for each file			Yes	No

SeaweedFS	comparable to Ceph	advantage
Master	MDS	simpler
Volume	OSD	optimized for small files
Filer	Ceph FS	linearly scalable, Customizable, O(1) or O(logN)


Gold Sponsors

Table of Contents

Quick Start for S3 API on Docker

Quick Start with single binary

Introduction

Additional Features

Filer Features

Kubernetes

Example: Using Seaweed Object Store

Start Master Server

Start Volume Servers

Write File

Save File Id

Read File

Rack-Aware and Data Center-Aware Replication

Allocate File Key on Specific Data Center

Other Features

Object Store Architecture

Master Server and Volume Server

Write and Read files

Storage Size

Saving memory

Tiered Storage to the cloud

Compared to Other File Systems

Compared to HDFS

Compared to GlusterFS, Ceph

Compared to GlusterFS

Compared to MooseFS

Compared to Ceph

Compared to MinIO

Dev Plan

Installation Guide

Disk Related Topics

Hard Drive Performance

Solid State Disk

Benchmark

License

Stargazers over time
