There are a few external Java libraries available, but SeaweedFS already supports a Hadoop-compatible file system, so there is Java code already working with SeaweedFS.
Here is a SeaweedFS Java API implementation refactored out of that existing code.
Build Java Client Jar
$ cd $GOPATH/src/github.com/seaweedfs/seaweedfs/other/java/client
$ mvn install
Gradle
implementation 'com.github.chrislusf:seaweedfs-client:3.80'
Maven
<dependency>
  <groupId>com.seaweedfs</groupId>
  <artifactId>seaweedfs-client</artifactId>
  <version>3.80</version>
</dependency>
Or you can download the latest version from Maven Central:
https://mvnrepository.com/artifact/com.github.chrislusf/seaweedfs-client
Features
This client implements the APIs for file system storage. The blob storage APIs are not included.
- Efficient Read/Write Path: reads and writes go directly to the volume servers, and the filer servers are only used for metadata. If you read or write over HTTP via the filer, the data still has to pass through the filer, which is less efficient.
- Monitor Filesystem Events: you can watch all metadata changes recursively under any folder. A common inotify watch only covers events in a single folder, not recursively.
- Read Ahead: when reading large files, the next chunk is pre-fetched while the current chunk is being processed.
Note
When creating a FilerClient object, the port 18888 used in the examples is the default gRPC port. When starting the filer with weed filer -port=8888, port 8888 is the default HTTP port; the gRPC port defaults to the HTTP port plus 10000, which gives 18888.
FilerClient filerClient = new FilerClient("localhost", 18888);
Read File
FilerClient filerClient = new FilerClient("localhost", 18888);
SeaweedInputStream seaweedInputStream = new SeaweedInputStream(filerClient, "/test.zip");
// next, you can use seaweedInputStream as a normal InputStream
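For example, a minimal sketch that reads the whole remote file into memory (only sensible for small files; InputStream#readAllBytes needs Java 9+):
FilerClient filerClient = new FilerClient("localhost", 18888);
try (SeaweedInputStream in = new SeaweedInputStream(filerClient, "/test.zip")) {
    // read the entire file; SeaweedInputStream is used here like any other InputStream
    byte[] content = in.readAllBytes();
    System.out.println("read " + content.length + " bytes");
}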
Write File
FilerClient filerClient = new FilerClient("localhost", 18888);
SeaweedOutputStream seaweedOutputStream = new SeaweedOutputStream(filerClient, "/test/"+filename);
// next, you can use seaweedOutputStream as a normal OutputStream
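For example, a minimal sketch that writes a small text file (the path /test/hello.txt is just an illustration):
FilerClient filerClient = new FilerClient("localhost", 18888);
try (SeaweedOutputStream out = new SeaweedOutputStream(filerClient, "/test/hello.txt")) {
    out.write("hello world".getBytes(java.nio.charset.StandardCharsets.UTF_8));
}
// closing the stream flushes the remaining data and finalizes the file entry on the filer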
Watch file changes
This API streams metadata changes.
The following is one implementation. It watches the folder "/buckets" and receives all metadata changes under that folder and all of its sub folders, recursively. It is a bit more code, but it is powerful and simple to use.
FilerClient filerClient = new FilerClient("localhost", 18888);
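// subscribe to changes starting from one hour ago; sinceNs is a unix timestamp in nanoseconds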
long sinceNs = (System.currentTimeMillis() - 3600 * 1000) * 1000000L;
Iterator<FilerProto.SubscribeMetadataResponse> watch = filerClient.watch(
        "/buckets",
        "exampleClientName",
        sinceNs
);
System.out.println("Connected to filer, subscribing from " + new Date());
while (watch.hasNext()) {
    FilerProto.SubscribeMetadataResponse event = watch.next();
    FilerProto.EventNotification notification = event.getEventNotification();
    if (!event.getDirectory().equals(notification.getNewParentPath())) {
        // move an entry to a new directory, possibly with a new name
        if (notification.hasOldEntry() && notification.hasNewEntry()) {
            System.out.println("moved " + event.getDirectory() + "/" + notification.getOldEntry().getName() + " to " + notification.getNewParentPath() + "/" + notification.getNewEntry().getName());
        } else {
            System.out.println("this should not happen.");
        }
    } else if (notification.hasNewEntry() && !notification.hasOldEntry()) {
        System.out.println("created entry " + event.getDirectory() + "/" + notification.getNewEntry().getName());
    } else if (!notification.hasNewEntry() && notification.hasOldEntry()) {
        System.out.println("deleted entry " + event.getDirectory() + "/" + notification.getOldEntry().getName());
    } else if (notification.hasNewEntry() && notification.hasOldEntry()) {
        System.out.println("updated entry " + event.getDirectory() + "/" + notification.getNewEntry().getName());
    }
}
Standard file manipulation
You can also use this API for standard file manipulation: directory listing, file touch, folder creation, file deletion, and recursive folder deletion.
FilerClient filerClient = new FilerClient("localhost", 18888);
List<FilerProto.Entry> entries = filerClient.listEntries("/");
for (FilerProto.Entry entry : entries) {
System.out.println(entry.toString());
}
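// create a folder, then touch two empty files with unix mode 0755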
filerClient.mkdirs("/new_folder", 0755);
filerClient.touch("/new_folder/new_empty_file", 0755);
filerClient.touch("/new_folder/new_empty_file2", 0755);
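// delete a single file, then recursively delete the folder (the boolean flags are assumed to be isRecursive and ignoreRecursiveError)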
filerClient.rm("/new_folder/new_empty_file", false, true);
filerClient.rm("/new_folder", true, true);
Advanced Usage
Sometimes you may need to go deeper, for example, to change the modification time mtime.
// load existing entry
FilerProto.Entry entry = filerClient.lookupEntry("/some/dir","entryName");
// change the attribute
FilerProto.Entry.Builder entryBuilder = FilerProto.Entry.newBuilder(entry);
FilerProto.FuseAttributes.Builder attrBuilder = FilerProto.FuseAttributes.newBuilder(entry.getAttributes());
attrBuilder.setMtime(...)
// save the new entry
entryBuilder.setAttributes(attrBuilder);
filerClient.updateEntry("/some/dir", entryBuilder.build());
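For example, a sketch that sets mtime to the current time, assuming mtime is a unix timestamp in seconds as defined in the filer protocol:
// set mtime to now (unix time in seconds)
attrBuilder.setMtime(System.currentTimeMillis() / 1000);
entryBuilder.setAttributes(attrBuilder);
filerClient.updateEntry("/some/dir", entryBuilder.build());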