Commit Graph

14 Commits

Author SHA1 Message Date
652261b774 Added support for known hub URLs in scout_logic.py to handle Playwright timeouts and errors more effectively. Updated fetching logic to prioritize known URLs when encountering issues, enhancing reliability in link extraction. 2026-01-31 18:42:42 +01:00
8000642eae Updated scout_logic.py to clarify the strategy for identifying the entry or overview page (Hub) for research reports, enhancing the prompt for link analysis to focus on Hub URLs rather than final download pages. 2026-01-31 18:39:08 +01:00
f7b328b7f2 Refined timeout strategy in scout_logic.py for URL fetching, introducing separate timeouts for 'commit' and 'domcontentloaded' states, and enhanced logging for better error visibility during page loading attempts. 2026-01-31 18:33:44 +01:00
beb80e9eaf Refactored timeout handling in scout_logic.py to improve URL fetching reliability, added detailed logging for error tracking, and implemented a total timeout for Playwright operations to prevent indefinite hangs. 2026-01-31 18:28:26 +01:00
b3e9a6455b Enhanced main.py and scout_logic.py with improved timeout handling for URL fetching, added logging for better request tracking, and optimized page loading strategy to prevent hangs on heavy pages. 2026-01-31 18:25:23 +01:00
46b59d2c5c Updated docker-compose.yml to clarify port mapping, modified Dockerfile.worker to enable access logging, and added logging functionality in main.py for request tracking. 2026-01-31 18:19:41 +01:00
3542e4564b Updated docker-compose.yml to change the port mapping from 8000 to 8010 for the worker service. 2026-01-31 18:15:15 +01:00
9dd44af2d4 Updated scout_logic.py to use the new Stealth class for bot detection evasion, replacing the previous stealth_async function call. 2026-01-31 18:13:40 +01:00
a18801a6aa Refactored import statement for playwright_stealth in scout_logic.py to align with updated package structure. 2026-01-31 18:11:33 +01:00
9c5f769455 Added playwright-stealth dependency and refactored link fetching logic in scout_logic.py to enhance bot detection evasion and implement HTTP/2 fallback handling. 2026-01-31 18:08:40 +01:00
afee46933f Enhanced scout_logic.py with improved browser configuration to bypass bot detection, added URL normalization functions, and implemented robust error handling for fetching links. 2026-01-31 18:04:49 +01:00
6e813daf69 Updated project structure and added initial configuration files. 2026-01-31 17:36:48 +01:00
bd0e602b09 Initial setup 2026-01-31 12:11:16 +01:00
24964cd507 Added README 2026-01-31 12:02:18 +01:00