This repository contains a script to cache OWLERY queries for Virtual Fly Brain (VFB) by running all possible queries with all potential anatomy IDs.
After each release of VFB, the OWLERY query server needs to have its cache populated with results for all possible queries to ensure fast response times for user queries.
The script extracts OWLERY queries from the queries_execution_notebook.ipynb, determines the restrictions on potential IDs (anatomy classes), uses VFBconnect to pull all potential anatomy IDs from the PDB database, and then runs each query against the OWLERY server to cache the results.
The script main.py:
- Connects to the VFB database using VFBconnect.
- Retrieves all anatomy class short_form IDs using a Cypher query.
- Sorts the IDs in descending order to process the newest ones first.
- For each predefined OWLERY query and each anatomy ID, constructs the query URL and sends a GET request to the OWLERY server.
- Runs queries concurrently (up to the specified number of parallel requests per ID) to speed up caching.
- Logs a success indicator (✓) with result count for successful queries, or error details with URL for failures.
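The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the endpoint URL, the `direct` parameter, the single query template, and the names `build_url` and `cache_all` are all assumptions made for the example, and the real queries come from the notebook. The `fetch` argument is injected so the sketch can be exercised without contacting the server.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode

# Assumed values for illustration only; the real endpoint and query
# templates are defined by main.py and the notebook.
OWLERY_BASE = "http://owl.virtualflybrain.org/kbs/vfb/subclasses"
QUERY_TEMPLATES = [
    "<http://purl.obolibrary.org/obo/BFO_0000050> some "
    "<http://purl.obolibrary.org/obo/{id}>",
]


def build_url(template, short_form):
    """Fill the anatomy ID into a query template and URL-encode it."""
    params = {"object": template.format(id=short_form), "direct": "false"}
    return f"{OWLERY_BASE}?{urlencode(params)}"


def cache_all(ids, fetch, parallel=9):
    """Run every query for every ID, processing IDs in descending order.

    `fetch` is any callable that takes a URL and returns a result count.
    Queries for one ID run concurrently, at most `parallel` at a time.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        for short_form in sorted(ids, reverse=True):
            urls = [build_url(t, short_form) for t in QUERY_TEMPLATES]
            # pool.map preserves the order of the input URLs.
            results[short_form] = list(pool.map(fetch, urls))
    return results
```

Passing a real HTTP function as `fetch` would warm the cache; passing a stub makes the ordering and URL construction easy to check in isolation.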
Run with:
source .venv/bin/activate
python main.py [--max-ids N] [--timeout T] [--parallel P]
Here, --max-ids N limits the run to the first N IDs per query, which is useful for testing (optional); --timeout T sets the per-request timeout in seconds (default 60); and --parallel P sets the number of requests run in parallel (default 9).
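For reference, the three options could be declared with `argparse` along these lines. This is a sketch under the assumption that main.py uses `argparse`; the function name `parse_args`, the help strings, and the argument types are illustrative, while the flag names and defaults are the ones documented above.

```python
import argparse


def parse_args(argv=None):
    """Parse the documented command-line options (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        description="Populate the OWLERY cache for VFB."
    )
    parser.add_argument("--max-ids", type=int, default=None,
                        help="only use the first N IDs per query (testing)")
    parser.add_argument("--timeout", type=float, default=60,
                        help="timeout in seconds for each request")
    parser.add_argument("--parallel", type=int, default=9,
                        help="number of requests to run at once")
    return parser.parse_args(argv)
```

With no arguments, `parse_args([])` yields `max_ids=None`, `timeout=60`, and `parallel=9`.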
Some queries may time out. A request that fails or times out is logged with its URL, and the cache is still populated for every query that succeeds.
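One way to keep a single slow query from aborting the whole run is to catch the error per request and return a status instead of raising. The sketch below assumes the server replies with JSON containing a `superClassOf` list; that key, the function name `cache_one`, and the `opener` parameter are assumptions for illustration, not taken from main.py.

```python
import json
import urllib.request


def cache_one(url, timeout=60, opener=urllib.request.urlopen):
    """Send one GET request and return (ok, message) instead of raising.

    `opener` defaults to urllib's urlopen and can be replaced by a stub.
    """
    try:
        with opener(url, timeout=timeout) as response:
            payload = json.load(response)
    except (OSError, ValueError) as exc:
        # OSError covers timeouts, connection errors and HTTP errors;
        # ValueError covers a response body that is not valid JSON.
        return False, f"{type(exc).__name__}: {exc} ({url})"
    # Assumed response shape: a JSON object with a "superClassOf" list.
    count = len(payload.get("superClassOf", []))
    return True, f"✓ {count} results"
```

Because failures come back as ordinary return values, the caller can log them and move on to the next query.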
The script is designed to run in a Jenkins job with Python 3.10 after each VFB release.
Create and activate a virtual environment:
python3 -m venv .venv
source .venv/bin/activate
Install with:
pip install -r requirements.txt
- .venv/: Python virtual environment.
- .gitignore: Git ignore file.
- main.py: The main script.
- requirements.txt: Python dependencies.
- LICENSE: MIT License.
- README.md: This documentation.