# Parser Service
## Overview
The Parser Service is the second component in the Redis pub/sub processing chain for bank-related email automation. It subscribes to messages with stage="red" from the Email Service, parses Excel attachments, and publishes the processed data back to Redis with stage="parsed".
## Features
### Redis Integration
- Subscribes to the "CollectedData" Redis channel for messages with stage="red"
- Processes Excel attachments contained in the messages
- Publishes parsed data back to Redis with stage="parsed" or "not found"
- Maintains message metadata and adds processing timestamps
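The message flow above might be sketched roughly as follows. Only the channel name `CollectedData` and the stage values come from this README; the JSON envelope, the `data` and `parsed_at` field names, the `parse_attachment` callback, republishing on the same channel, and using "not found" when no rows are extracted are all assumptions made for illustration.

```python
import json
from datetime import datetime, timezone

CHANNEL = "CollectedData"  # channel name from this README


def handle_message(raw, parse_attachment):
    """Return the JSON string to republish, or None if the message is not ours.

    Assumes messages are JSON objects with a "stage" field (hypothetical schema).
    """
    message = json.loads(raw)
    if message.get("stage") != "red":
        return None  # ignore messages meant for other stages

    rows = parse_attachment(message)  # hypothetical parser callback
    out = dict(message)  # keep the original metadata
    out["stage"] = "parsed" if rows else "not found"  # assumed meaning of "not found"
    out["data"] = rows
    out["parsed_at"] = datetime.now(timezone.utc).isoformat()
    return json.dumps(out)


def run(host, port, password, parse_attachment):
    """Subscribe, process, and republish using redis-py."""
    import redis  # imported lazily so handle_message can be tested without Redis

    client = redis.Redis(host=host, port=port, password=password, decode_responses=True)
    pubsub = client.pubsub()
    pubsub.subscribe(CHANNEL)
    for item in pubsub.listen():
        if item["type"] != "message":
            continue  # skip subscribe confirmations
        result = handle_message(item["data"], parse_attachment)
        if result is not None:
            # assumption: results go back on the same channel
            client.publish(CHANNEL, result)
```

Keeping the stage check in a plain function separates the routing logic from the Redis connection, so it can be exercised with literal strings.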
### Excel Processing
- Parses bank statement Excel files
- Extracts transaction data including:
  - IBAN numbers
  - Transaction dates and times
  - Currency values and balances
  - Transaction types and references
  - Branch information
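As a rough illustration of the extracted fields listed above, the sketch below normalizes a single row that has already been read from a spreadsheet. The column names and the date format are hypothetical, and it uses only the standard library rather than the Pandas and Arrow calls the service itself relies on.

```python
from datetime import datetime
from decimal import Decimal


def normalize_row(row):
    """Turn one raw statement row (hypothetical column names) into typed fields."""
    return {
        # IBANs are often printed in groups of four; store them without spaces
        "iban": row["iban"].replace(" ", "").upper(),
        # assumed source format: day.month.year hour:minute
        "date": datetime.strptime(row["date"], "%d.%m.%Y %H:%M"),
        # Decimal avoids float rounding on currency values
        "amount": Decimal(row["amount"]),
        "balance": Decimal(row["balance"]),
        "type": row["type"].strip(),
        "reference": row["reference"].strip(),
        "branch": row["branch"].strip(),
    }
```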
### Error Handling
- Robust error management for Excel parsing
- Detailed logging of processing steps and errors
- Graceful handling of malformed messages
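One way to handle malformed messages gracefully is to decode defensively and skip anything unusable instead of raising. This is a minimal sketch, assuming JSON payloads with a `stage` field; the logger name and the `decode_message` helper are hypothetical.

```python
import json
import logging

logger = logging.getLogger("parser_service")  # hypothetical logger name


def decode_message(raw):
    """Return the decoded message dict, or None if the payload is malformed."""
    try:
        message = json.loads(raw)
    except (TypeError, ValueError) as exc:
        # json.JSONDecodeError is a subclass of ValueError
        logger.warning("Skipping undecodable message: %s", exc)
        return None
    if not isinstance(message, dict) or "stage" not in message:
        logger.warning("Skipping message without a stage field")
        return None
    return message
```

Returning `None` lets the subscriber loop continue with the next message, so one bad payload does not stop the service.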
## Configuration
### Environment Variables
The service uses the same Redis configuration as the Email Service:
```
REDIS_HOST=10.10.2.15
REDIS_PORT=6379
REDIS_PASSWORD=your_strong_password_here
```
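A minimal sketch of reading these variables in Python is shown below. The `load_redis_config` helper and its fallback values are assumptions, not the service's actual startup code.

```python
import os


def load_redis_config(env=None):
    """Read Redis settings from the environment variables named above."""
    env = os.environ if env is None else env
    return {
        "host": env.get("REDIS_HOST", "localhost"),  # fallback is an assumption
        "port": int(env.get("REDIS_PORT", "6379")),  # env values are strings
        "password": env.get("REDIS_PASSWORD") or None,  # empty string means no password
    }
```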
## Deployment
### Docker
The service is containerized using Docker and can be deployed using the provided Dockerfile and docker-compose configuration.
```bash
# Build and start the service
docker compose -f bank-services-docker-compose.yml up -d --build
# View logs
docker compose -f bank-services-docker-compose.yml logs -f parser_service
# Stop the service
docker compose -f bank-services-docker-compose.yml down
```
### Service Management
The `check_bank_services.sh` script provides a simple way to restart the service:
```bash
./check_bank_services.sh
```
## Architecture
### Redis Pub/Sub Chain
This service is the second in a multi-stage processing chain:
1. **Email Service**: Reads emails, extracts attachments, publishes to Redis with stage="red"
2. **Parser Service** (this service): Subscribes to stage="red" messages, parses Excel data, republishes with stage="parsed"
3. **Writer Service**: Subscribes to stage="parsed" messages, writes data to final destination, marks as stage="completed"
## Development
### Dependencies
- Python 3.12
- Pandas and xlrd for Excel processing
- Redis for pub/sub messaging
- Arrow for date handling
- Unidecode for text normalization