In profiling integration tests, I found a couple places where per-test
overhead could be reduced:
* Avoiding disk IO by synchronizing instead of deleting & copying test
Git repository data. This saves ~100ms per test on my machine
* When flushing queues in `PrintCurrentTest`, invoke `FlushWithContext`
in a parallel.
---------
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
This is a large and complex PR, so let me explain in detail its changes.
First, I had to create new index mappings for Bleve and ElasticSerach as
the current ones do not support search by filename. This requires Gitea
to recreate the code search indexes (I do not know if this is a breaking
change, but I feel it deserves a heads-up).
I've used [this
approach](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/analysis-pathhierarchy-tokenizer.html)
to model the filename index. It allows us to efficiently search for both
the full path and the name of a file. Bleve, however, does not support
this out-of-box, so I had to code a brand new [token
filter](https://blevesearch.com/docs/Token-Filters/) to generate the
search terms.
I also did an overhaul in the `indexer_test.go` file. It now asserts the
order of the expected results (this is important since matches based on
the name of a file are more relevant than those based on its content).
I've added new test scenarios that deal with searching by filename. They
use a new repo included in the Gitea fixture.
The screenshot below depicts how Gitea shows the search results. It
shows results based on content in the same way as the current version
does. In matches based on the filename, the first seven lines of the
file contents are shown (BTW, this is how GitHub does it).
![image](https://github.com/user-attachments/assets/9d938d86-1a8d-4f89-8644-1921a473e858)
Resolves#32096
---------
Signed-off-by: Bruno Sofiato <bruno.sofiato@gmail.com>
assert.Fail() will continue to execute the code while assert.FailNow()
not. I thought those uses of assert.Fail() should exit immediately.
PS: perhaps it's a good idea to use
[require](https://pkg.go.dev/github.com/stretchr/testify/require)
somewhere because the assert package's default behavior does not exit
when an error occurs, which makes it difficult to find the root error
reason.
This PR removed `unittest.MainTest` the second parameter
`TestOptions.GiteaRoot`. Now it detects the root directory by current
working directory.
---------
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Refactor `modules/indexer` to make it more maintainable. And it can be
easier to support more features. I'm trying to solve some of issue
searching, this is a precursor to making functional changes.
Current supported engines and the index versions:
| engines | issues | code |
| - | - | - |
| db | Just a wrapper for database queries, doesn't need version | - |
| bleve | The version of index is **2** | The version of index is **6**
|
| elasticsearch | The old index has no version, will be treated as
version **0** in this PR | The version of index is **1** |
| meilisearch | The old index has no version, will be treated as version
**0** in this PR | - |
## Changes
### Split
Splited it into mutiple packages
```text
indexer
├── internal
│ ├── bleve
│ ├── db
│ ├── elasticsearch
│ └── meilisearch
├── code
│ ├── bleve
│ ├── elasticsearch
│ └── internal
└── issues
├── bleve
├── db
├── elasticsearch
├── internal
└── meilisearch
```
- `indexer/interanal`: Internal shared package for indexer.
- `indexer/interanal/[engine]`: Internal shared package for each engine
(bleve/db/elasticsearch/meilisearch).
- `indexer/code`: Implementations for code indexer.
- `indexer/code/internal`: Internal shared package for code indexer.
- `indexer/code/[engine]`: Implementation via each engine for code
indexer.
- `indexer/issues`: Implementations for issues indexer.
### Deduplication
- Combine `Init/Ping/Close` for code indexer and issues indexer.
- ~Combine `issues.indexerHolder` and `code.wrappedIndexer` to
`internal.IndexHolder`.~ Remove it, use dummy indexer instead when the
indexer is not ready.
- Duplicate two copies of creating ES clients.
- Duplicate two copies of `indexerID()`.
### Enhancement
- [x] Support index version for elasticsearch issues indexer, the old
index without version will be treated as version 0.
- [x] Fix spell of `elastic_search/ElasticSearch`, it should be
`Elasticsearch`.
- [x] Improve versioning of ES index. We don't need `Aliases`:
- Gitea does't need aliases for "Zero Downtime" because it never delete
old indexes.
- The old code of issues indexer uses the orignal name to create issue
index, so it's tricky to convert it to an alias.
- [x] Support index version for meilisearch issues indexer, the old
index without version will be treated as version 0.
- [x] Do "ping" only when `Ping` has been called, don't ping
periodically and cache the status.
- [x] Support the context parameter whenever possible.
- [x] Fix outdated example config.
- [x] Give up the requeue logic of issues indexer: When indexing fails,
call Ping to check if it was caused by the engine being unavailable, and
only requeue the task if the engine is unavailable.
- It is fragile and tricky, could cause data losing (It did happen when
I was doing some tests for this PR). And it works for ES only.
- Just always requeue the failed task, if it caused by bad data, it's a
bug of Gitea which should be fixed.
---------
Co-authored-by: Giteabot <teabot@gitea.io>
Change all license headers to comply with REUSE specification.
Fix#16132
Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
This PR continues the work in #17125 by progressively ensuring that git
commands run within the request context.
This now means that the if there is a git repo already open in the context it will be used instead of reopening it.
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Add missing `X-Total-Count` and fix some related bugs
Adds `X-Total-Count` header to APIs that return a list but doesn't have it yet.
Fixed bugs:
* not returned after reporting error (39eb82446c/routers/api/v1/user/star.go (L70))
* crash with index out of bounds, API issue/issueSubscriptions
I also found various endpoints that return lists but do not apply/support pagination yet:
```
/repos/{owner}/{repo}/issues/{index}/labels
/repos/{owner}/{repo}/issues/comments/{id}/reactions
/repos/{owner}/{repo}/branch_protections
/repos/{owner}/{repo}/contents
/repos/{owner}/{repo}/hooks/git
/repos/{owner}/{repo}/issue_templates
/repos/{owner}/{repo}/releases/{id}/assets
/repos/{owner}/{repo}/reviewers
/repos/{owner}/{repo}/teams
/user/emails
/users/{username}/heatmap
```
If this is not expected, an new issue should be opened.
Closes#13043
* fmt
* Update routers/api/v1/repo/issue_subscription.go
Co-authored-by: KN4CK3R <admin@oldschoolhack.me>
* Use FindAndCount
Co-authored-by: KN4CK3R <admin@oldschoolhack.me>
Co-authored-by: 6543 <6543@obermui.de>
* Some refactors related repository model
* Move more methods out of repository
* Move repository into models/repo
* Fix test
* Fix test
* some improvements
* Remove unnecessary function
* Support elastic search for code search
* Finished elastic search implementation and add some tests
* Enable test on drone and added docs
* Add new fields to elastic search
* Fix bug
* remove unused changes
* Use indexer alias to keep the gitea indexer version
* Improve codes
* Some code improvements
* The real indexer name changed to xxx.v1
Co-authored-by: zeripath <art27@cantab.net>