This revision is from 2024/03/18 08:55.
SearXNG installs itself under /usr/local/searxng, with the main source code in the searxng-src directory.
To hack the results, the file to modify is webapp.py, at /usr/local/searxng/searxng-src/searx/webapp.py.
The relevant function in webapp.py is:

    @app.route('/search', methods=['GET', 'POST'])
    def search():
A cache could work by...
- making a directory in the searx folder named cache
- making sub-folders in the cache directory, a to z and 0 to 9
- naming the cache files after the search term
- checking whether the cache filename exists when a search is performed
- if there is a match, reading in the local file instead and skipping the real search
- sending the keywords to the maintainers so they can update the cache; they can then crawl the search engines and build a more comprehensive cache
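The directory scheme above could be sketched as follows. Everything here is an assumption for illustration: cache_path_for and cached are hypothetical helpers, CACHE_ROOT is an assumed location, and the a-z/0-9 bucket is taken from the first character of a normalised (lowercased, alphanumeric-slugged) search term:

```python
import os
import re

CACHE_ROOT = "/usr/local/searxng/searxng-src/searx/cache"  # assumed location


def cache_path_for(term: str) -> str:
    """Map a search term to its cache file path (hypothetical helper)."""
    # Normalise: lowercase, keep only a-z and 0-9, join words with '-'
    slug = re.sub(r'[^a-z0-9]+', '-', term.lower()).strip('-') or '0'
    bucket = slug[0]  # sub-folder a-z / 0-9, from the first character
    return os.path.join(CACHE_ROOT, bucket, slug + ".json")


def cached(term: str) -> bool:
    """Check whether a cache file already exists for this term."""
    return os.path.isfile(cache_path_for(term))
```

With this scheme, "Hello World" would land in the h/ bucket as hello-world.json.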
Proposed SearXNG options:
- use cache
- update the cache
- disclosure to the end user
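The three options could boil down to a small settings block; a minimal sketch, where the option names themselves are assumptions (they are not existing SearXNG settings):

```python
# Hypothetical cache option block (names are assumptions, not real SearXNG settings)
CACHE_SETTINGS = {
    'use_cache': True,         # serve results from the local cache when present
    'update_cache': True,      # write fresh results back into the cache
    'disclose_to_user': True,  # show the user when a result came from the cache
}
```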
Benefits:
- turns SearXNG into a full search engine built from cached results
- searches are against a local file, so it speeds up searching significantly
- offline searching if the cache gets big enough
Files in question:
/usr/local/searxng/searxng-src/searx/search/__init__.py : class Search
/usr/local/searxng/searxng-src/searx/webapp.py : def search() : search = SearchWithPlugins(search_query, request.user_plugins, request)  # pylint: disable=redefined-outer-name
In class Search, the original method

    def search_multiple_requests(self, requests):

was duplicated as

    def search_multiple_requests2(self, requests):

An if/else clause, based on whether cached results exist, chooses between returning the cached version and doing the real search.
Something like this, in def search_standard(self) of class Search (filepath and requests come from the surrounding method):

    if os.path.isfile(filepath):
        self.search_multiple_requests2(requests)  # cached version
    else:
        self.search_multiple_requests(requests)   # do the real search
The original:

    def search_multiple_requests(self, requests):
        # pylint: disable=protected-access
        search_id = str(uuid4())

        for engine_name, query, request_params in requests:
            _search = copy_current_request_context(PROCESSORS[engine_name].search)
            th = threading.Thread(  # pylint: disable=invalid-name
                target=_search,
                args=(query, request_params, self.result_container, self.start_time, self.actual_timeout),
                name=search_id,
            )
            th._timeout = False
            th._engine_name = engine_name
            th.start()

        for th in threading.enumerate():  # pylint: disable=invalid-name
            if th.name == search_id:
                remaining_time = max(0.0, self.actual_timeout - (default_timer() - self.start_time))
                th.join(remaining_time)
                if th.is_alive():
                    th._timeout = True
                    self.result_container.add_unresponsive_engine(th._engine_name, 'timeout')
                    PROCESSORS[th._engine_name].logger.error('engine timeout')
The duplicate, returning mock results instead of searching:

    def search_multiple_requests2(self, requests):
        # pylint: disable=protected-access
        search_id = str(uuid4())

        # Skip actual searching, assign a mock result container instead
        mock_result_container = ResultContainer()
        web_results = ['Mock Web Result 1', 'Mock Web Result 2', 'Mock Web Result 3']
        # Ensure each result dictionary has a 'content' key
        mock_web_results = [{'url': result, 'content': ''} for result in web_results]
        mock_result_container.extend('web', mock_web_results)
        self.result_container = mock_result_container
Populating the result container correctly for the mock search_multiple_requests2 is the nightmare.
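One way around that nightmare is to avoid rebuilding container internals by hand: store plain result dicts as JSON at write time, then feed them back through the container's extend() at read time, just as the mock version does. A minimal sketch of the JSON round-trip only, under the assumption that cached results are simple url/title/content dicts (save_results and load_results are hypothetical helpers; ResultContainer itself is not imported here):

```python
import json
import os


def save_results(filepath, results):
    """Write plain result dicts (url/title/content) to a cache file."""
    os.makedirs(os.path.dirname(filepath), exist_ok=True)
    with open(filepath, 'w', encoding='utf-8') as f:
        json.dump(results, f)


def load_results(filepath):
    """Read cached result dicts back, ready to pass to ResultContainer.extend()."""
    with open(filepath, encoding='utf-8') as f:
        results = json.load(f)
    # Ensure each result dictionary has a 'content' key, as the mock version does
    for r in results:
        r.setdefault('content', '')
    return results
```

Inside search_multiple_requests2, self.result_container.extend('cache', load_results(filepath)) could then replace the hard-coded mock list (hypothetical wiring, not tested against the real container).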