Commit Graph

57 Commits

Author SHA1 Message Date
chrislu
90acfd9394 remove case when read request is outside of the file size 2024-11-05 08:42:44 -08:00
chrislu
98b519b113 fix FUSE mount on mac 2024-11-05 08:28:54 -08:00
Chris Lu
dc784bf217
merge current message queue code changes (#6201)
* listing files to convert to parquet

* write parquet files

* save logs into parquet files

* pass by value

* compact logs into parquet format

* can skip existing files

* refactor

* refactor

* fix compilation

* when no partition found

* refactor

* add untested parquet file read

* rename package

* refactor

* rename files

* remove unused

* add merged log read func

* parquet wants to know the file size

* rewind by time

* pass in stop ts

* add stop ts

* adjust log

* minor

* adjust log

* skip .parquet files when reading message logs

* skip non message files

* Update subscriber_record.go

* send messages

* skip message data with only ts

* skip non log files

* update parquet-go package

* ensure a valid record type

* add new field to a record type

* Update read_parquet_to_log.go

* fix parquet file name generation

* separating reading parquet and logs

* add key field

* add skipped logs

* use in memory cache

* refactor

* refactor

* refactor

* refactor, and change compact log

* refactor

* rename

* refactor

* fix format

* prefix v to version directory
2024-11-04 12:08:25 -08:00
Bruce
5428229347
fix file read crash (#6021) 2024-09-14 08:33:35 -07:00
Eugeniy E. Mikhailov
dab0bb8097
Feature limit caching to prescribed number of bytes per file (#6009)
* feature: we can check if a fileId is already in the cache

We use this to protect the cache from adding the same needle
over and over.

* fuse mount: Do not start downloader if needle is already in the cache

* added maxFilePartSizeInCache property to ChunkCache

If a file is very large, only the first maxFilePartSizeInCache bytes
are put into the cache (subject to the needle size
constraints).

* feature: for large files put in cache no more than prescribed number of bytes

Before this patch only the first needle of a large file was intended for
caching. This patch caches up to the prescribed number of bytes instead.
This makes it possible to bypass the default 2MB maximum for a file part
stored in the cache.

* added dummy mock methods to satisfy interfaces of ChunkCache
2024-09-11 21:09:20 -07:00
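
The two ideas in the commit above (skip a needle that is already cached, and cache only the leading part of a large file) can be sketched as follows. This is a minimal illustration, not the SeaweedFS implementation: the `chunkCache` type, `isInCache`, `maybeSet`, and the rule "cache a chunk only if it ends within the first maxFilePartSizeInCache bytes" are all assumptions made for the example. Only the name `maxFilePartSizeInCache` comes from the commit message.

```go
package main

import "fmt"

// chunkCache is a hypothetical in-memory stand-in, not the SeaweedFS ChunkCache.
type chunkCache struct {
	maxFilePartSizeInCache uint64            // per-file byte budget for cached data
	entries                map[string][]byte // fileId -> cached chunk bytes
}

func newChunkCache(maxFilePartSizeInCache uint64) *chunkCache {
	return &chunkCache{
		maxFilePartSizeInCache: maxFilePartSizeInCache,
		entries:                map[string][]byte{},
	}
}

// isInCache reports whether a chunk with this fileId is already cached.
func (c *chunkCache) isInCache(fileId string) bool {
	_, ok := c.entries[fileId]
	return ok
}

// maybeSet stores data unless the fileId is already cached or the chunk
// ends past the per-file budget. It returns true when the chunk was stored.
func (c *chunkCache) maybeSet(fileId string, offsetInFile uint64, data []byte) bool {
	if c.isInCache(fileId) {
		return false // avoid adding the same needle over and over
	}
	if offsetInFile+uint64(len(data)) > c.maxFilePartSizeInCache {
		return false // beyond the prescribed number of bytes for this file
	}
	c.entries[fileId] = data
	return true
}

func main() {
	c := newChunkCache(8) // assumed budget: first 8 bytes of the file
	fmt.Println(c.maybeSet("chunk-a", 0, []byte("abcd"))) // true: inside the budget
	fmt.Println(c.maybeSet("chunk-a", 0, []byte("abcd"))) // false: already cached
	fmt.Println(c.maybeSet("chunk-b", 4, []byte("efgh"))) // true: ends exactly at the budget
	fmt.Println(c.maybeSet("chunk-c", 8, []byte("ijkl"))) // false: past the budget
}
```

With a budget larger than one needle, several leading chunks of a file can be cached, which is the behavior change the commit describes.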
chrislu
18afdb15b6 Revert "weed mount, weed dav add option to force cache"
This reverts commit 7367b976b0.
2024-09-04 01:38:29 -07:00
chrislu
7367b976b0 weed mount, weed dav add option to force cache 2024-09-04 01:19:14 -07:00
zemul
0122e022ea
Mount concurrent read (#4400)
* fix:mount deadlock

* feat: concurrent read

* fix

* Remove useless code

* fix

---------

Co-authored-by: zemul <zhouzemiao@ihuman.com>
2023-04-13 22:32:45 -07:00
chrislu
bfe5d910c6 use one readerCache for the whole file 2023-01-16 22:43:02 -08:00
Chris Lu
d4566d4aaa
more solid weed mount (#4089)
* compare chunks by timestamp

* fix slab clearing error

* fix test compilation

* move oldest chunk to sealed, instead of by fullness

* lock on fh.entryViewCache

* remove verbose logs

* revert slab clearing

* less logs

* less logs

* track write and read by timestamp

* remove useless logic

* add entry lock on file handle release

* use mem chunk only, swap file chunk has problems

* comment out code that may be used later

* add debug mode to compare data read and write

* more efficient readResolvedChunks with linked list

* small optimization

* fix test compilation

* minor fix on writer

* add SeparateGarbageChunks

* group chunks into sections

* turn off debug mode

* fix tests

* fix tests

* tmp enable swap file chunk

* Revert "tmp enable swap file chunk"

This reverts commit 985137ec47.

* simple refactoring

* simple refactoring

* do not re-use swap file chunk. Sealed chunks should not be re-used.

* comment out debugging facilities

* either mem chunk or swap file chunk is fine now

* remove orderedMutex  as *semaphore.Weighted

not found impactful

* optimize size calculation for changing large files

* optimize performance to avoid going through the long list of chunks

* still problems with swap file chunk

* rename

* tiny optimization

* swap file chunk save only successfully read data

* fix

* enable both mem and swap file chunk

* resolve chunks with range

* rename

* fix chunk interval list

* also change file handle chunk group when adding chunks

* pick in-active chunk with time-decayed counter

* fix compilation

* avoid nil with empty fh.entry

* refactoring

* rename

* rename

* refactor visible intervals to *list.List

* refactor chunkViews to *list.List

* add IntervalList for generic interval list

* change visible interval to use IntervalList in generics

* change chunkViews to *IntervalList[*ChunkView]

* use NewFileChunkSection to create

* rename variables

* refactor

* fix renaming leftover

* renaming

* renaming

* add insert interval

* interval list adds lock

* incrementally add chunks to readers

Fixes:
1. set start and stop offset for the value object
2. clone the value object
3. use pointer instead of copy-by-value when passing to interval.Value
4. use insert interval since adding chunk could be out of order

* fix tests compilation

* fix tests compilation
2023-01-02 23:20:45 -08:00
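
Fix 4 in the commit above notes that chunks can arrive out of order, so an ordered insert is needed instead of a plain append. The sketch below shows that idea on a generic list kept sorted by start offset. The `interval` and `intervalList` types here are hypothetical and much simpler than the `IntervalList` named in the commit: there is no locking and no handling of overlapping intervals.

```go
package main

import (
	"fmt"
	"sort"
)

// interval is a hypothetical [start, stop) range carrying a value.
type interval[T any] struct {
	start, stop int64
	value       T
}

// intervalList keeps its items sorted by start offset.
type intervalList[T any] struct {
	items []*interval[T]
}

// insert places the new interval at its sorted position, so callers may
// add chunks in any order. Overlaps are not resolved in this sketch.
func (l *intervalList[T]) insert(start, stop int64, value T) {
	// Find the first item whose start is not before the new start.
	i := sort.Search(len(l.items), func(i int) bool {
		return l.items[i].start >= start
	})
	l.items = append(l.items, nil) // grow by one slot
	copy(l.items[i+1:], l.items[i:]) // shift the tail right
	l.items[i] = &interval[T]{start: start, stop: stop, value: value}
}

func main() {
	l := &intervalList[string]{}
	l.insert(10, 20, "b") // arrives first, but belongs in the middle
	l.insert(0, 10, "a")
	l.insert(20, 30, "c")
	for _, it := range l.items {
		fmt.Println(it.start, it.stop, it.value)
	}
}
```

Storing pointers to intervals mirrors fix 3 (pointer instead of copy-by-value), although how the real list uses them is not shown in the commit message.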
chrislu
388f82f322 minor 2022-08-21 11:49:29 -07:00
chrislu
77e4b1376e refactoring 2022-08-21 11:35:54 -07:00
Patrick Schmidt
3f758820c1
Fix FUSE server buffer leaks in file gaps (#3472)
* Fix FUSE server buffer leaks in file gaps

This change zeros read buffers when encountering file gaps during
file/chunk reads in FUSE mounts.
It prevents leaking internal buffers of the FUSE server which could
otherwise reveal metadata, directory listings, file contents and
other data related to FUSE API calls.
The issue was that buffers are reused, but when a file gap was found
the buffer was not zeroed accordingly and the existing data of the
buffer was kept and returned.

* Move zero logic into its own method
2022-08-21 11:33:58 -07:00
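
The leak described in the commit above comes from reusing a read buffer without clearing the parts that no chunk covers. A minimal sketch of the fix, assuming chunks are sorted by offset and do not overlap; the `chunk` type and the `readAt` and `zero` functions are hypothetical names for illustration, not the SeaweedFS code:

```go
package main

import "fmt"

// chunk is a hypothetical stand-in for one stored piece of a file.
type chunk struct {
	offset int64
	data   []byte
}

// zero clears buf[start:stop] so stale bytes from a reused buffer are never returned.
func zero(buf []byte, start, stop int64) {
	for i := start; i < stop; i++ {
		buf[i] = 0
	}
}

// readAt fills buf with file content starting at offset. Ranges not covered
// by any chunk (file gaps) are zeroed instead of being left untouched.
func readAt(buf []byte, chunks []chunk, offset int64) {
	pos := offset // next file offset that still needs to be filled
	end := offset + int64(len(buf))
	for _, c := range chunks {
		cStart, cEnd := c.offset, c.offset+int64(len(c.data))
		if cEnd <= pos || cStart >= end {
			continue // chunk lies outside the remaining range
		}
		if cStart > pos {
			zero(buf, pos-offset, cStart-offset) // gap before this chunk
			pos = cStart
		}
		stop := cEnd
		if stop > end {
			stop = end
		}
		copy(buf[pos-offset:stop-offset], c.data[pos-cStart:stop-cStart])
		pos = stop
	}
	if pos < end {
		zero(buf, pos-offset, end-offset) // trailing gap
	}
}

func main() {
	buf := []byte("SECRET!!") // reused buffer still holding old data
	chunks := []chunk{{offset: 2, data: []byte("ab")}, {offset: 6, data: []byte("c")}}
	readAt(buf, chunks, 0)
	fmt.Printf("%q\n", buf) // "\x00\x00ab\x00\x00c\x00"
}
```

Without the `zero` calls, the bytes `SE`, `ET`, and the final `!` would survive in the returned buffer, which is the kind of leak the commit message warns about.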
Konstantin Lebedev
4d08393b7c
filer prefer volume server in same data center (#3405)
* initial prefer same data center
https://github.com/seaweedfs/seaweedfs/issues/3404

* GetDataCenter

* prefer same data center for ReplicationSource

* GetDataCenterId

* remove glog
2022-08-04 17:35:00 -07:00
chrislu
26dbc6c905 move to https://github.com/seaweedfs/seaweedfs 2022-07-29 00:17:28 -07:00
chrislu
f2f0482dd3 mount: random read also try to use the local cache first 2022-07-07 11:50:28 -07:00
chrislu
53513475bf mount: add back random read support
also avoid using too much memory
2022-03-13 01:38:52 -08:00
chrislu
941ced60a4 download 2 chunks if at the beginning of a file 2022-02-27 03:57:24 -08:00
chrislu
551d00d51a prefetch other chunks when stream reading 2022-02-26 23:20:45 -08:00
chrislu
3345a50d9b prefetch 2 chunks 2022-02-26 03:06:17 -08:00
chrislu
28b395bef4 better control for reader caching 2022-02-26 02:16:47 -08:00
banjiaojuhao
45e9c83421 padding zero for sparse file 2022-01-13 22:21:22 +08:00
chrislu
9a00c17555 reader: avoid wrong pattern detection due to lock waiting 2021-12-28 16:30:33 -08:00
chrislu
9f9ef1340c use streaming mode for long poll grpc calls
streaming mode would create separate grpc connections for each call.
this is to ensure the long poll connections are properly closed.
2021-12-26 00:15:03 -08:00
chrislu
b541e39a2c fix tests 2021-12-22 16:17:30 -08:00
chrislu
4c1368d621 fix test 2021-12-22 16:05:08 -08:00
chrislu
0cb9036f66 mount: only cache the first chunk on stream read 2021-12-19 23:06:03 -08:00
chrislu
a152f17937 mount: improve read performance on random reads 2021-12-19 22:43:14 -08:00
Chris Lu
1737af480a adjust logs 2021-05-10 21:47:51 -07:00
Nathan Hawkins
042de9359c make reader_at handle random reads more efficiently for FUSE 2021-04-28 19:13:37 -04:00
Chris Lu
990fa69bfe add back AdjustedUrl() related code 2021-01-28 14:36:29 -08:00
Chris Lu
00707ec00f mount: outsideContainerClusterMode proxy through filer
Running mount outside of the cluster does not need to expose all the volume servers outside the cluster. Chunk reads and writes go through the filer.
2021-01-24 19:01:58 -08:00
Chris Lu
6ca10725b8 Revert "mount: when outside cluster network, use filer as proxy to access volume servers"
This reverts commit 096e088d7b.
2021-01-24 03:15:19 -08:00
Chris Lu
096e088d7b mount: when outside cluster network, use filer as proxy to access volume servers 2021-01-24 01:41:38 -08:00
Chris Lu
de876c795d minor fix 2021-01-18 01:14:27 -08:00
Chris Lu
2b76854641 add "weed filer.cat" to read files directly from volume servers 2021-01-06 04:22:00 -08:00
Chris Lu
8e78187a97 add back last read chunk cache to reader and properly close the reader 2020-12-08 22:26:46 -08:00
Chris Lu
900d22c6ec mount: avoid memory leaking read buffer
fix https://github.com/chrislusf/seaweedfs/issues/1654

the reader goes together with the file handle, which may stay for a long time.
2020-12-08 02:38:53 -08:00
Chris Lu
8750cac090 move to util.RetryWaitTime 2020-11-01 02:36:43 -08:00
Chris Lu
df8d976bb0 refactoring 2020-11-01 01:58:48 -07:00
Chris Lu
bd103c143a add lock for vidCache 2020-10-21 19:28:59 -07:00
Chris Lu
93bcf56514 file read report EOF
fix https://github.com/chrislusf/seaweedfs/issues/1344
2020-10-14 12:18:24 -07:00
Chris Lu
723ae11db4 refactoring in order to adjust volume server url later 2020-10-11 20:15:10 -07:00
Chris Lu
d155f907c2 mount: configurable read wait time 2020-10-10 20:09:43 -07:00
Chris Lu
8a52379ecb add retry if volume can not be found 2020-10-10 16:02:39 -07:00
Chris Lu
cff8bb6554 return proper error 2020-10-10 15:43:22 -07:00
Chris Lu
b2ee5873fb fix error not being returned 2020-10-08 23:19:20 -07:00
Chris Lu
eed492b73b randomize file locations 2020-10-07 23:58:32 -07:00
Chris Lu
a8624c2e4f read from alternative replica
related to https://github.com/chrislusf/seaweedfs/issues/1512
2020-10-07 22:49:04 -07:00
Chris Lu
36492c47ec adjust 2020-10-05 14:06:18 -07:00