Caching SearXNG

This revision is from 2024/03/17 17:39.

SearXNG installs itself under /usr/local/searxng, with the main source code in the searxng-src directory (/usr/local/searxng/searxng-src).

To hack the results, edit webapp.py, located at /usr/local/searxng/searxng-src/searx/webapp.py

The function that handles searches in webapp.py is...

@app.route('/search', methods=['GET', 'POST'])
def search():

A cache could work by...

  • making a directory named cache in the searx folder
  • making sub-folders in the cache directory, one for each of a to z and 0 to 9
  • naming each cache file after its search term
  • checking whether a matching filename exists when a search is performed
  • if there is a match, reading the local file and skipping the live search
  • sending the keywords to the maintainers so they can update the cache; they can then crawl the search engines and build a more comprehensive cache
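The directory layout and filename lookup described above could be sketched roughly as follows. This is only an illustration: create_cache_tree, cache_path and BUCKETS are hypothetical names, not part of SearXNG, and the rule of lower-casing the term and replacing unusual characters with underscores is an assumption.

```python
import os
import re
import string
import tempfile

# Hypothetical helper names; nothing here is part of SearXNG itself.
BUCKETS = string.ascii_lowercase + string.digits  # 'a'..'z' then '0'..'9'

def create_cache_tree(cache_root):
    """Create cache_root with one sub-folder per letter and digit."""
    for bucket in BUCKETS:
        os.makedirs(os.path.join(cache_root, bucket), exist_ok=True)

def cache_path(cache_root, query):
    """Map a search term to its cache file, or None if it has no bucket."""
    # Assumption: lower-case the term and replace anything unusual with '_'.
    name = re.sub(r'[^a-z0-9 _-]', '_', query.strip().lower())
    if not name or name[0] not in BUCKETS:
        return None  # empty, or starts with a character outside a-z / 0-9
    return os.path.join(cache_root, name[0], name)

if __name__ == '__main__':
    root = os.path.join(tempfile.mkdtemp(), 'cache')
    create_cache_tree(root)
    path = cache_path(root, 'Linux Kernel')
    print(path)                  # the file 'linux kernel' inside the 'l' sub-folder
    print(os.path.isfile(path))  # False until a search stores results there
```

Terms that do not start with a letter or digit get no cache file in this sketch, since the proposal only defines sub-folders for a to z and 0 to 9.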

Proposed SearXNG options:

  • use cache
  • update the cache
  • disclosure to end user
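One possible reading of these three options is sketched below. The CacheOptions class, its field names and plan_search are all hypothetical and are not taken from SearXNG's settings. The sketch assumes that "update the cache" means storing fresh results after a live search, and that "disclosure" means telling the user where the results came from.

```python
from dataclasses import dataclass

@dataclass
class CacheOptions:
    """Hypothetical switches, not taken from SearXNG's settings."""
    use_cache: bool = True     # serve a stored result file when one exists
    update_cache: bool = True  # store fresh results after a live search
    disclose: bool = True      # tell the user where the results came from

def plan_search(options, cache_hit):
    """Return (source, store_afterwards, notice) for one search request."""
    if options.use_cache and cache_hit:
        source, store = 'cache', False
    else:
        source, store = 'engines', options.update_cache
    notice = f'Results served from {source}' if options.disclose else None
    return source, store, notice

if __name__ == '__main__':
    print(plan_search(CacheOptions(), cache_hit=True))
    print(plan_search(CacheOptions(use_cache=False), cache_hit=True))
```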

Benefits:

  • turns SearXNG into a full search engine built from cached results
  • searches that hit the cache are answered from a local file, which speeds them up significantly
  • offline searching if the cache gets big enough

Refactoring def search():

# Needs json, os and re imported at the top of webapp.py

# Sanitize the search query string for use as a filename
sanitized_query = re.sub(r'[/:?"*|<>\']', '_', save_q)
first_char = sanitized_query[0].lower()
subdir = f'cache/{first_char}'
filepath = os.path.join(subdir, sanitized_query)

if os.path.isfile(filepath):
    # Load results from the file if it exists instead of searching
    with open(filepath, 'r') as file:
        results = []
        for line in file:
            # Each line in the file is a JSON string for a single result
            result_dict = json.loads(line.strip())
            results.append(result_dict)
else:
    # Fetch results from the API or other source
    results = result_container.get_ordered_results()
    # Dump results into a cache file named after the keyword, one JSON object per line.
    # default=str stores values that json cannot encode as strings, so they come back as strings.
    os.makedirs(subdir, exist_ok=True)
    with open(filepath, 'w') as file:
        for result in results:
            file.write(json.dumps(result, default=str) + '\n')
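To try the hit-or-miss logic outside of webapp.py, a standalone sketch along the following lines may help. cached_search and fake_fetch are hypothetical names. fake_fetch only stands in for result_container.get_ordered_results(), and the one-JSON-object-per-line file format is the assumption already made in the comments above. The temporary-file step is an addition, not something the original proposal asks for.

```python
import json
import os
import tempfile

def cached_search(filepath, fetch):
    """Return (results, from_cache); fetch() runs only on a cache miss."""
    if os.path.isfile(filepath):
        with open(filepath, 'r', encoding='utf-8') as file:
            return [json.loads(line) for line in file if line.strip()], True
    results = fetch()
    os.makedirs(os.path.dirname(filepath), exist_ok=True)
    # Write under a temporary name first, then rename, so an interrupted
    # write does not leave a truncated cache file in place.
    tmp_path = filepath + '.tmp'
    with open(tmp_path, 'w', encoding='utf-8') as file:
        for result in results:
            file.write(json.dumps(result, default=str) + '\n')
    os.replace(tmp_path, filepath)
    return results, False

if __name__ == '__main__':
    calls = []

    def fake_fetch():  # stand-in for result_container.get_ordered_results()
        calls.append(1)
        return [{'title': 'Example', 'url': 'https://example.org'}]

    path = os.path.join(tempfile.mkdtemp(), 'cache', 'e', 'example')
    print(cached_search(path, fake_fetch))  # miss: fetches and stores
    print(cached_search(path, fake_fetch))  # hit: read back from the file
    print(len(calls))                       # number of live fetches made
```

Real SearXNG results may hold values that JSON cannot represent exactly, so results read back from the cache may need converting before the templates can render them.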

  
